A Journey to Smarter and Faster Legal Research


About Tax-Fin-Lex

With their commitment to high-quality services and a continuous pursuit of best practices, Tax-Fin-Lex has become synonymous with reliability, expertise, and adaptability. Through its multidisciplinary approach, combining experts in accounting, law, information technology, and tax legislation, Tax-Fin-Lex successfully integrates technological and legal aspects of business, providing clients with comprehensive support in managing tax obligations and financial processes.

Benefiting from ongoing investments in research and development, Tax-Fin-Lex has excelled in crafting innovative technological solutions, empowering clients to streamline tax and financial operations, thus enhancing efficiency. Through advanced platforms and tailored applications designed to meet specific client requirements, Tax-Fin-Lex demonstrates an unwavering commitment to exploring novel solutions and furnishing its clients with a competitive edge.


The complexity of legal documentation

Tax-Fin-Lex was faced with a challenge in improving the user experience on their platform. A key part of this was making it easier for users to find the information they needed. Starting off, their search function only worked with specific keywords, which can be limiting. To address this, they brought us on board to upgrade their search system, allowing users to ask questions using natural language.  The upgrade from specific keywords to questions using natural language allows users for a faster and easier use of the chat, not having to use precise keywords and slows conversing on a simpler level.

Navigating through legal documents like legislation, court decisions, and expert publications posed a particular difficulty. Legal language is often dense and complex, making it hard to quickly find relevant information. To tackle this, we used artificial intelligence to identify and extract the most important parts of judgments. This not only makes searching faster but also generates concise summaries, making it easier for legal professionals to get the information they need. For example, if searching for an answer to a question they not only receive the answer instantly without having to search for it through several documents, while also receiving a summary of the section where the answer is written.

Additionally, integrating chat-based interaction using GPT technology added another layer of complexity. Allowing users to ask questions naturally required advanced language processing to understand and respond effectively. This feature represents a significant step forward in user engagement and accessibility, reflecting Tax-Fin-Lex’s commitment to providing cutting-edge solutions for legal professionals.

In summary, these challenges fall under the broader RAG Retrieval-Augmented Generation problem. By working together and integrating innovative technology, Tax-Fin-Lex aims to overcome these obstacles and redefine how legal information is managed.


Solving the issue

The project faced significant complexity primarily due to the large volume and variety of documents available for processing. Tax-Fin-Lex’s database contains a wealth of legal materials, including legislation, court judgments, and expert texts, making it a substantial task to manage and extract relevant information effectively.

Additionally, before analysis could begin, the texts required thorough cleaning and formatting to ensure consistency and accuracy. This preparatory step was crucial to optimize the subsequent processing stages.

However, the most challenging aspect of the project was identifying key entities within the documents and understanding the context in which they were referenced. This task demanded advanced linguistic and semantic analysis capabilities to identify entities that appear and are important across documents whilst also understanding their influence on a specific case.

Moreover, developing a robust and scalable system capable of responding to queries in natural language presented another significant challenge. Crafting an intuitive interface that could accurately interpret user queries and provide relevant responses required sophisticated algorithmic approaches and extensive testing. The interface is split amongst different cases of use based on the type of questions imposed and the legislation they apply to.

Addressing these complexities required a multidisciplinary approach, combining expertise in linguistics, artificial intelligence, and software engineering. By leveraging state-of-the-art technologies and meticulous attention to detail, the project aimed to deliver a comprehensive solution tailored to the specific needs of Tax-Fin-Lex and its users.

Below, you’ll find just a sampling of the questions that the ChatBot can tackle. This should give you a good idea of what it’s capable of and how useful it can be:


– What should be included in an employment contract?
– How to deliver a termination notice to an employee?
– What should I, as an employer, pay attention to when issuing a termination?

– What is real estate? (General) article
– Can maintenance work be done on a monument?
– Can I follow my dog ​​that has escaped onto someone else’s property? (Due to our programing, the dog is recognized as a domestic animal and the chatbt can provide the correct response.)

– The defendant resides at an unknown address abroad. The court has repeatedly tried to serve me with a summons for questioning, but unsuccessfully. Can detention be ordered under Article 201 of the Civil Procedure Act due to mutual suspicion (only according to case law)?

– @legalcode   by chosing the correct legislation, the answer can be more precise) – Is the seller liable for the condition of the product?

– @date (by chosing the date, the asnwer will corelate the current legislation)  – What is the fine for exceeding the speed limit in a residential area by 15 km/h?


Strategies for Retrieval-Augmented Generation

One of the primary challenges we encountered in our project revolved around efficiently retrieving relevant textual excerpts for the chatbot’s responses while ensuring the smooth operation of the RAG system. This encompassed the need to navigate through a vast repository of legal documents within Tax-Fin-Lex’s database, including legislation, court judgments, and expert publications.

 1. Automatic Entity Recognition: To streamline the extraction of pertinent information, we implemented sophisticated Natural Language Processing (NLP) models. These models were utilized to automatically identify entities within the text, such as mentions of articles and names of laws. By automating this process, we aimed to expedite the retrieval of relevant textual excerpts from the extensive database.


  • Natural Language Processing (NLP) Models:
    We relied on cutting-edge Natural Language Processing (NLP) models to automate the process of entity recognition within Tax-Fin-Lex’s extensive database of legal documents. These models, including techniques such as Named Entity Recognition (NER) and part-of-speech tagging, helped us identify key entities within the text, such as articles and law names. By leveraging these advanced NLP capabilities, we made information extraction more efficient, streamlining the search and retrieval process for users.

 2. Embedding Comparison: In the subsequent stage, we employed advanced embedding comparison techniques to assess the semantic similarity between the user’s search query and the textual excerpts stored in the database. This involved utilizing embedding search functionalities to retrieve the top N documents that closely resembled the user’s query, thereby enhancing the relevance of the search results.

  • Embedding Comparison Techniques:
    Alongside NLP models, we utilized sophisticated embedding comparison techniques to evaluate the semantic similarity between user queries and the textual excerpts stored in the database. Embedding representations, derived from deep learning models, encode semantic information in a compact vector space. By comparing these embeddings, we could measure the semantic similarity between documents and user queries, improving the relevance of search results and facilitating a deeper understanding of the text’s underlying semantic relationships.
  • Weaviate Vector Database:
    To handle the vast volume of legal documents in Tax-Fin-Lex’s database, we employed Weaviate, a highly optimized vector database. Weaviate specializes in handling large-scale vector data and enables lightning-fast searches across extensive datasets. By indexing textual documents as vectors in Weaviate, we achieved rapid and efficient information retrieval. Additionally, Weaviate’s flexible query capabilities allowed us to fine-tune the search process, striking a balance between search accuracy and speed to ensure an optimal user experience.


 3. Response Generation: Building upon the retrieved documents, our system was designed to generate responses in natural language using cutting-edge language generation models such as chat GPT. By synthesizing information from the identified documents, the chatbot could furnish users with comprehensive and contextually relevant answers to their queries.

  • Chat GPT:
    For generating responses in natural language, we integrated Chat GPT, an advanced language generation model developed by OpenAI. Trained on a vast corpus of conversational data, Chat GPT excels at understanding and generating human-like text. By leveraging Chat GPT’s capabilities, we could provide contextually relevant responses to user queries, enhancing user engagement and comprehension. Moreover, Chat GPT enabled us to effectively communicate complex legal concepts in a clear and accessible manner, further improving the usability of our platform for users.

Insightful responses in milliseconds

Our efforts to improve the search functionality has yielded significant results, fundamentally transforming user interactions with Tax-Fin-Lex’s platform. We have successfully indexed the entirety of legislation, court judgments, and expert documents, totaling over a million records. Impressively, searches across this extensive database started off quite slowly but with our finetuning now produce results in mere milliseconds, thanks to the efficient indexing and retrieval mechanisms we’ve implemented.

User feedback has been overwhelmingly positive, with users expressing appreciation for the newfound ease and speed of accessing information. By transitioning from keyword-based searches to semantic searches, users can now explore concepts and ideas more efficiently, significantly enhancing their information retrieval process.

While the exact numbers are confidential, our innovative solution has not only boosted user satisfaction but has also led to a noticeable increase in subscribers to Tax-Fin-Lex’s services. The uniqueness and added value of our solution have set Tax-Fin-Lex apart from competitors, attracting new users seeking innovative approaches to legal information management.

Looking ahead, our chatbot continues to evolve, with ongoing efforts focused on further improving search and comprehension of complex legal challenges. Our aim remains to empower legal professionals to be more efficient in their work and to pique the curiosity of users interested in delving deeper into legal matters. Through continuous innovation and refinement, we aspire to solidify Tax-Fin-Lex’s position as a leader in providing cutting-edge solutions for legal information retrieval and analysis.