Summer of open-source with Whissle and Red Hen Lab

By Karan Singla

Aug 31 2024

At Whissle, our commitment to open-source development is not just a business strategy; it is ingrained in our identity. Our journey, intertwined with the efforts of Red Hen Lab, reflects this deep-rooted connection, particularly through our participation in Google Summer of Code (GSoC) 2024. This year, our team collaborated with three talented students who contributed to advancing multilingual language models, Retrieval-Augmented Generation (RAG) pipelines, and visual-aware speech recognition.

1. Multilingual News LLM Fine-Tuning

Our first blog highlights Tarun Jain's work on developing a Large Language Model (LLM) using Red Hen's extensive news archives. By leveraging datasets across six languages, including English, Spanish, French, German, and Portuguese, Tarun successfully trained specialized reporter-like models that enhance multilingual AI for journalism. The models, optimized through advanced techniques like GGUF quantization, are now available on HuggingFace, democratizing access to powerful AI tools for the open-source community.
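To give a flavor of what GGUF-style quantization does, here is a minimal conceptual sketch of block-wise 4-bit absmax quantization in NumPy. This is an illustration of the general idea only, not the actual GGUF format or Tarun's pipeline; the function names and block size are our own choices.

```python
import numpy as np

def quantize_q4(weights: np.ndarray, block_size: int = 32):
    """Block-wise 4-bit absmax quantization, similar in spirit to GGUF's Q4 formats.

    Each block stores one float scale plus small integer codes,
    shrinking 32-bit weights to a few bits per value.
    """
    w = weights.reshape(-1, block_size)
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0  # map each block to [-7, 7]
    scales[scales == 0] = 1.0                            # avoid divide-by-zero on all-zero blocks
    codes = np.clip(np.round(w / scales), -7, 7).astype(np.int8)
    return codes, scales

def dequantize_q4(codes: np.ndarray, scales: np.ndarray, shape):
    # Reconstruct approximate weights from integer codes and per-block scales.
    return (codes.astype(np.float32) * scales).reshape(shape)

# Round-trip a random weight matrix and measure the quantization error.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
codes, scales = quantize_q4(w)
w_hat = dequantize_q4(codes, scales, w.shape)
max_err = np.abs(w - w_hat).max()
```

The rounding error is bounded by half a quantization step per block, which is why large models remain usable after this kind of compression.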

2. Developing a RAG Pipeline and Guardrails for Data Security

Yufei Gao's blog showcases the development of a multilingual RAG pipeline, a crucial component for retrieving information from previously unseen data. The work emphasized data security by implementing guardrails that check context alignment and answer accuracy. Despite the challenges faced in achieving high context recall, the project lays the foundation for more secure and reliable RAG implementations, contributing valuable insights and tools back to the open-source ecosystem.
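The shape of such a pipeline can be sketched in a few lines: retrieve the closest documents, then apply a grounding guardrail that rejects answers whose content is absent from the retrieved context. This is a toy illustration with bag-of-words retrieval and a crude overlap check, not Yufei's implementation; all names and thresholds here are illustrative.

```python
import math
from collections import Counter

def toks(s: str) -> list[str]:
    # Lowercase and strip simple punctuation before splitting.
    return s.lower().replace(".", " ").replace(",", " ").split()

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    q = Counter(toks(query))
    return sorted(docs, key=lambda d: cosine(q, Counter(toks(d))), reverse=True)[:k]

def grounding_guardrail(answer: str, context: list[str], threshold: float = 0.5) -> bool:
    # Reject answers whose content words are mostly absent from the
    # retrieved context -- a crude proxy for "context alignment".
    ctx_tokens = set(toks(" ".join(context)))
    ans_tokens = [t for t in toks(answer) if len(t) > 3]
    if not ans_tokens:
        return False
    overlap = sum(t in ctx_tokens for t in ans_tokens) / len(ans_tokens)
    return overlap >= threshold

docs = [
    "The archive contains broadcast news in six languages.",
    "Guardrails check that answers stay grounded in retrieved context.",
]
context = retrieve("which languages are in the news archive", docs)
grounded = grounding_guardrail("the archive contains news in six languages", context)
ungrounded = grounding_guardrail("the moon is made of basalt and cheese", context)
```

In a production pipeline the bag-of-words scorer would be replaced by dense embeddings and the overlap check by a learned faithfulness metric, but the control flow stays the same: retrieve, generate, then verify before answering.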

3. Visual-Aware Speech Recognition in Noisy Settings

Finally, our third blog from Lakshmipathi Balaji delves into the innovative work on integrating visual cues into speech recognition models to tackle the challenge of transcribing speech in noisy environments. By combining audio and visual data, this approach significantly improves transcription accuracy, even in challenging scenarios like crowded rooms. This project represents a forward-thinking shift in audio processing technology, with broad applications from automated transcription services to enhanced hearing aids.
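One simple way to see why visual cues help is late fusion: weight the audio stream's predictions by an estimate of the signal-to-noise ratio, so that in a crowded room the visual (lip-reading) stream dominates. The sketch below is a toy NumPy illustration of that idea, not the architecture from the project; the sigmoid weighting and all values are assumptions for the example.

```python
import numpy as np

def fuse_logits(audio_logits: np.ndarray, visual_logits: np.ndarray, snr_db: float) -> np.ndarray:
    """Late fusion of per-class logits from an audio and a visual stream.

    The audio stream's weight shrinks as the estimated SNR drops, so in
    noisy scenes the visual stream drives the decision.
    """
    # Map SNR in dB to a weight in (0, 1) with a sigmoid centred at 0 dB.
    w_audio = 1.0 / (1.0 + np.exp(-snr_db / 5.0))
    return w_audio * audio_logits + (1.0 - w_audio) * visual_logits

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy example: the audio stream favours class 0, the visual stream class 1.
audio = np.array([2.0, 0.0, -1.0])
visual = np.array([0.0, 2.0, -1.0])

clean = softmax(fuse_logits(audio, visual, snr_db=20.0))   # clean audio: trust the microphone
noisy = softmax(fuse_logits(audio, visual, snr_db=-20.0))  # heavy noise: trust the lips
```

Real audio-visual models fuse learned representations much earlier than this, but the weighting intuition carries over: the noisier the audio, the more the model should lean on what it sees.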

Our involvement in these open-source initiatives is a testament to Whissle's dedication to fostering collaboration and innovation in the AI community. The advancements made during GSoC 2024 not only contribute to our mission but also empower developers worldwide, further solidifying our role in the open-source landscape.
