Language Models and Neuro-Logic Reasoning

Large Language Models (LLMs) have established state-of-the-art results on several generation and reasoning benchmarks. Despite this success, LLMs struggle with logical reasoning and with maintaining logical consistency during generation. LLMs to date also suffer from hallucination, where the model generates inaccurate or false content that is difficult for humans to detect, and they perform poorly on tasks that require long-term memory and reasoning over temporal dynamics in text. Lastly, LLMs accessed through third-party APIs, such as GPT, are vulnerable to adversarial attacks such as prompt injection and may store user data on third-party servers, which precludes their use for processing confidential and sensitive data. To address these issues, in this research we employ hybrid neuro-symbolic logical reasoning techniques to ground LLM text generation explicitly in logic.

First-Order Logic Representation of Natural Language Text

Despite their success, LLMs struggle with logical reasoning and with maintaining logical consistency during generation. Obtaining such a capability requires LLMs to accurately represent (or translate) natural language (NL) statements into first-order logic (FOL) rules, which remains a challenge even for the most powerful LLMs to date, such as ChatGPT. To this end, we propose to train a LLaMA-7B model for NL-FOL translation, fine-tuned with LoRA. We set up a pipeline that automatically collects NL-FOL pairs from GPT-4 and verifies them to ensure high quality. We then propose a supervised fine-tuning plus reinforcement learning from human feedback (SFT + RLHF) framework to align the model's output FOL rules with the NL statements. Such a model enables a variety of logic-based AI applications, including natural language theorem proving, logic-grounded natural language inference, and question answering (QA).
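As a rough illustration of the LoRA-based supervised fine-tuning step, the sketch below attaches low-rank adapters to a causal LM and takes one training step on a single NL-FOL pair. It assumes a Hugging Face-style stack (transformers + peft); the checkpoint name, hyperparameters, prompt format, and the example pair are illustrative placeholders rather than the actual training setup.

```python
# Minimal sketch, assuming the transformers and peft libraries.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Hypothetical NL-FOL pair of the kind collected from GPT-4 and then verified.
example = {
    "nl": "Every student who studies passes the exam.",
    "fol": "forall x. (Student(x) and Studies(x)) -> Passes(x)",
}

base_model = "meta-llama/Llama-2-7b-hf"  # placeholder 7B base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Attach LoRA adapters so only a small set of low-rank parameters is updated
# during supervised fine-tuning on NL-FOL pairs.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Format one example as a prompt/completion pair and take one SFT step
# (optimizer and batching omitted for brevity).
prompt = f"Translate to first-order logic:\n{example['nl']}\nFOL:"
inputs = tokenizer(prompt + " " + example["fol"], return_tensors="pt")
outputs = model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()
```

The RLHF stage would then reuse the same adapter-augmented model, with a reward signal derived from human (or verifier) judgments of whether the produced FOL rule faithfully captures the NL statement.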

Logical Grounding of Large Language Models

A common limitation of Large Language Models (LLMs) to date is hallucination, where the model generates inaccurate or false content that is difficult for humans to detect. Moreover, LLMs perform poorly on tasks that require long-term and complex reasoning. In storytelling and code generation, LLMs are limited by the context window and can generate inconsistent content once earlier generations fall outside that window. To address these issues, we propose a new transformer architecture that can access structured memory and perform logical reasoning and reflection over longer contexts during generation via external tools such as an inductive logic programming (ILP) reasoner and an SMT solver.
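To make the external-tool grounding concrete, the sketch below shows one way a generation loop could consult an SMT solver: facts committed to structured memory are encoded as constraints, and a candidate continuation is accepted only if it is satisfiable together with them. It assumes the Z3 solver (z3-solver package) as the external tool; the propositions and the candidate continuations are illustrative, not drawn from the actual system or architecture.

```python
# Minimal sketch, assuming the z3-solver Python package as the SMT backend.
from z3 import Solver, Bool, Implies, Not, sat

# Facts established earlier in the generation, kept in structured memory.
alice_home = Bool("alice_at_home")
alice_office = Bool("alice_at_office")

memory = [
    alice_home,                              # "Alice is at home."
    Implies(alice_home, Not(alice_office)),  # she cannot be in both places
]

def is_consistent(new_fact) -> bool:
    """Check whether a candidate continuation is logically consistent
    with everything stored in memory."""
    solver = Solver()
    solver.add(*memory)
    solver.add(new_fact)
    return solver.check() == sat

# A continuation implying "Alice is at the office" is rejected, so the
# decoder would re-sample or constrain its next tokens instead.
print(is_consistent(alice_office))       # False
print(is_consistent(Not(alice_office)))  # True
```

An ILP reasoner could play the complementary role of inducing such rules (e.g., the mutual-exclusion constraint above) from examples stored in memory, rather than requiring them to be written by hand.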