Research Interests

My research centers on the intersection of large language models (LLMs), trustworthy NLP, and patent analytics, with a focus on developing methodologies that are both technically rigorous and practically valuable in high-stakes domains.


1. Trustworthy Text Generation in High-Stakes Domains

In critical fields such as intellectual property, biomedical informatics, and legal tech, the reliability of AI-generated content is paramount. I focus on:

  • Designing structure-aware and legally consistent evaluation metrics for patent claims.
  • Exploring robustness, hallucination avoidance, and domain fidelity in LLM-based generation pipelines.
  • Benchmarking LLMs against expert-written claims to quantify generation quality in functional, structural, and legal dimensions.

📌 Key contribution: Development of multi-dimensional evaluation frameworks (e.g., PatentScore) for patent claim assessment.
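The idea of scoring claims along several dimensions can be sketched as a weighted aggregate. The dimension names, weights, and function below are illustrative only, not the actual PatentScore formulation:

```python
# Illustrative multi-dimensional claim scoring in the spirit of PatentScore.
# Dimension names and weights are made up for this sketch.

def aggregate_claim_score(dim_scores, weights=None):
    """Combine per-dimension scores in [0, 1] into one weighted overall score."""
    if weights is None:
        weights = {d: 1.0 for d in dim_scores}  # unweighted by default
    total = sum(weights[d] for d in dim_scores)
    return sum(s * weights[d] for d, s in dim_scores.items()) / total

scores = {"functional": 0.8, "structural": 0.6, "legal": 0.9}
overall = aggregate_claim_score(scores, {"functional": 1, "structural": 1, "legal": 2})  # ≈ 0.8
```

In practice each dimension score would itself come from a learned or rule-based scorer rather than being supplied directly.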


2. Domain-Specific Evaluation of LLMs

Generic NLG metrics often fail to capture the nuances required in specialized fields. My work involves:

  • Creating domain-specific evaluation frameworks that go beyond surface-level similarity.
  • Incorporating legal validity, antecedent clarity, structural soundness, and domain coverage into automated scoring pipelines.
  • Leveraging expert annotations and weak/strong supervision techniques for fine-grained regression models.
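Antecedent clarity lends itself to a simple rule-based illustration: in a claim, "the X" or "said X" should be preceded by "a/an X". A toy heuristic, with the function name and regex purely illustrative:

```python
import re

def antecedent_issues(claim_text):
    """Flag 'the X' / 'said X' references that lack an earlier 'a/an X' antecedent."""
    introduced = set()
    issues = []
    # Walk determiner + noun pairs in order of appearance (very rough heuristic:
    # single-word nouns only, no compound terms, no real parsing).
    for m in re.finditer(r"\b(a|an|the|said)\s+([a-z]+)", claim_text.lower()):
        det, noun = m.groups()
        if det in ("a", "an"):
            introduced.add(noun)
        elif noun not in introduced:
            issues.append(noun)
    return issues

antecedent_issues("A device comprising a sensor, wherein the sensor sends "
                  "a signal to the processor")  # flags "processor"
```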

3. Patent Intelligence and Strategic Innovation Analysis

Patents are a rich but underutilized source of technological and strategic insight. I apply NLP and data science techniques to:

  • Extract innovation trajectories, emerging technologies, and market-entry signals from patent corpora.
  • Model patent value using LLMs through pairwise preference, semantic similarity, and legal structure embeddings.
  • Link patent information to business decisions, M&A strategies, and startup innovation positioning.
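Pairwise preference modeling of patent value can be illustrated with a Bradley-Terry fit over judgments (from an LLM or an expert) of which of two patents seems more valuable. The patent IDs and preference counts below are fabricated for this sketch:

```python
# Illustrative Bradley-Terry fit for patent-value modeling from pairwise
# preferences. Data and IDs are made up; a real pipeline would use many
# more comparisons and richer features.

def bradley_terry(wins, items, iters=100):
    """wins[(i, j)] = number of times item i was preferred over item j."""
    score = {it: 1.0 for it in items}
    for _ in range(iters):
        new = {}
        for i in items:
            num = sum(w for (a, b), w in wins.items() if a == i)  # total wins of i
            den = 0.0
            for (a, b), w in wins.items():
                if i in (a, b):
                    other = b if a == i else a
                    den += w / (score[i] + score[other])
            new[i] = num / den if den else score[i]
        total = sum(new.values())
        score = {it: v * len(items) / total for it, v in new.items()}  # normalize
    return score

wins = {("P1", "P2"): 3, ("P2", "P1"): 1, ("P2", "P3"): 3,
        ("P3", "P2"): 1, ("P1", "P3"): 4, ("P3", "P1"): 1}
values = bradley_terry(wins, ["P1", "P2", "P3"])  # P1 ranked highest
```

The fitted scores give a value ranking consistent with the observed preferences; embeddings and legal-structure features would refine this beyond raw win counts.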

If you are working in related areas or are interested in collaboration, feel free to contact me.