Applications are invited for an online internship opportunity with OpenNyAI from July 20 to August 20. Apply by July 15!
About the Organisation
OpenNyAI, created in 2021, is a collaborative mission that aims to catalyse exponentially better justice solutions by harnessing the power of AI.
The mission’s members include Agami, Thoughtworks, Ekstep, and the National Law School of India University, Bangalore. Agami, which steers the OpenNyAI mission, is geared towards discovering and catalysing innovation in the field of law and justice.
OpenNyAI works with a diverse network of legal professionals, technologists and academic institutions. The goal is to encourage the integration of AI in the justice sector by developing the necessary tools and addressing the challenges posed by the process.
By combining their network’s expertise, the partners within OpenNyAI are actively working towards establishing standardised testing methodologies currently lacking in indigenous AI development, bridging a crucial gap in technology evaluation.
About the Project
Artificial Intelligence (AI) technologies are increasingly integrated into various sectors, including the legal field. RAG (Retrieval-Augmented Generation) systems are designed to enhance the accuracy of AI responses.
AI models often “hallucinate”, generating answers that sound plausible but are inaccurate. A RAG system comprises two main elements: a retriever and a generator. The retriever fetches information from an existing database, and the generator uses this information to produce a response. Because the response is grounded in a curated database, a RAG system gives more accurate answers, reducing the possibility of hallucinations.
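The two-stage pipeline described above can be sketched in a few lines of Python. Note that the corpus, the word-overlap scoring, and the stand-in generator below are purely illustrative assumptions, not OpenNyAI's or Jugalbandi's actual implementation:

```python
# Toy RAG pipeline: retrieve relevant text, then generate from it.
# Corpus contents and the overlap-based ranking are hypothetical examples.

CORPUS = {
    "doc1": "Article 21 guarantees the right to life and personal liberty.",
    "doc2": "Section 482 CrPC preserves the inherent powers of High Courts.",
}

def retrieve(question: str, corpus: dict, k: int = 1) -> list:
    """Rank documents by word overlap with the question (toy retriever)."""
    q_words = set(question.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda item: len(q_words & set(item[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def generate(question: str, context: list) -> str:
    """Stand-in generator: in a real system, an LLM conditions on the context."""
    return "Based on the retrieved context: " + " ".join(context)

question = "What does Article 21 guarantee?"
answer = generate(question, retrieve(question, CORPUS))
```

Because the generator only sees the retrieved passages, its answer stays anchored to the database rather than to whatever the model might otherwise invent.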
However, the evaluation of these technologies poses significant challenges due to the lack of a unified framework. Current evaluation methods are often model-specific, and metrics used for assessing Large Language Models (LLMs) may not be directly applicable to RAG systems.
This highlights the need for standardised testing methodologies, especially in indigenous AI development contexts where such frameworks are absent.
To address the challenges of AI evaluation, our program aims to create a comprehensive legal benchmarking dataset in India. This dataset will be used to test RAG systems on two key metrics: recall and accuracy.
Recall measures the system’s ability to retrieve all relevant information, while accuracy assesses the correctness of the information retrieved. The creation of this dataset will involve a diverse cohort of students from various disciplines, including law, social sciences, gender studies, data sciences, and humanities. The goal is to foster a dataset that supports the development of Responsible AI by reflecting a wide range of perspectives and expertise.
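The two metrics above can be made concrete over annotated chunk IDs. This is a minimal sketch, assuming hypothetical chunk identifiers, and it reads "accuracy of the information retrieved" as the fraction of retrieved chunks that are relevant (i.e. precision), which is one plausible interpretation:

```python
# Recall: of all relevant chunks, how many did the retriever find?
# Precision: of the chunks it returned, how many were actually relevant?

def recall(retrieved: set, relevant: set) -> float:
    return len(retrieved & relevant) / len(relevant) if relevant else 1.0

def precision(retrieved: set, relevant: set) -> float:
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

relevant = {"c1", "c2", "c3"}    # chunks annotated as ground truth
retrieved = {"c1", "c2", "c4"}   # chunks the retriever returned

print(recall(retrieved, relevant))     # 2 of 3 relevant chunks found
print(precision(retrieved, relevant))  # 2 of 3 returned chunks relevant
```

A system can score high on one metric and low on the other, which is why the benchmark tracks both.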
The OpenNyAI mission is looking for volunteers to help compile our legal dataset. This dataset will be used to measure the accuracy of an AI framework: the RAG system. To register your interest, we require you to submit a task (~15 minutes) detailed below.
Number of Volunteers Required
200 people
Location
Remote
Duration of Internship
2-3 hours every weekday from 20th July to 20th August
Who can Apply
- Law Students: To participate in our annotating exercises, as a significant portion of the work requires legal aptitude. We are preferably looking for 2nd – 5th year law students.
- Students from various disciplines: Including social sciences, gender studies, data sciences, engineering, and humanities. We aim to create a dataset that lends itself to Responsible AI, and we’re looking for as much diversity as possible.
Incentives
Participants in this program will benefit from weekly interactions with experts in related fields through scheduled talks. They will receive certificates recognising their contribution and opportunities for future engagement with our communities, and will take part in collaborative meetings that foster peer interaction and learning.
Most importantly, contributors will play a crucial role in developing a legal benchmark dataset that could become a standard tool for testing legal AI applications.
Program Details
Students are provided with different datasets and a login to a portal that allows them to simulate questions, annotate the parts of the document that contain the answer, and write out their answer. The Jugalbandi retriever, along with two other retrievers, is then run on the same question.
Students will have to compare the chunks of information pulled up by all three retrievers to determine their relevance to the question. Relevant chunks are annotated by the students, who then revise their answers based on the new information. These revised answers will be used as ground truths to measure the accuracy of the RAG system.
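The comparison step above can be sketched as pooling each retriever's chunks and scoring them against the student-annotated relevant set. The retriever names (other than Jugalbandi) and the chunk IDs below are hypothetical placeholders:

```python
# Score each retriever by how much of the annotated ground truth it recovered.
# Chunk IDs and the two unnamed retrievers are illustrative assumptions.

annotated_relevant = {"c1", "c3"}  # chunks students marked as answer-bearing

retriever_outputs = {
    "jugalbandi": ["c1", "c2"],
    "retriever_b": ["c3"],
    "retriever_c": ["c4", "c5"],
}

scores = {
    name: len(set(chunks) & annotated_relevant) / len(annotated_relevant)
    for name, chunks in retriever_outputs.items()
}
print(scores)  # per-retriever recall against the annotated set
```

The students' revised answers then serve the same role for the generator that the annotated chunks serve here for the retriever: a human-verified reference to score against.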
Deadline
15th July
Contact
Email: aayana@agami.in; smita@agami.in