AIR#200 - Boosting AI: Contextual Retrieval & Self-Correcting Models! ⚡

Hey there!

Here's the latest AI news for today. Enjoy!

Today's top stories

🔥 Contextual Retrieval
Anthropic introduces Contextual Retrieval, enhancing RAG by improving retrieval accuracy with contextual embeddings and BM25.

🔥 Training Language Models to Self-Correct via Reinforcement Learning
A new reinforcement learning method, SCoRe, enhances self-correction in language models, improving performance by up to 15.6%.

🔥 Three Mile Island nuclear plant restart in Microsoft AI power deal
Microsoft and Constellation Energy plan to restart Three Mile Island nuclear plant to meet rising electricity demands for AI.

🔥 Federal civil rights watchdog sounds alarm over Feds' use of facial recognition
Federal watchdog warns that facial recognition use by DOJ, DHS, and HUD lacks oversight, risking civil rights violations.

🔥 MemoRAG – Enhance RAG with memory-based knowledge discovery for long contexts
MemoRAG enhances retrieval-augmented generation with a memory-based interface, improving context understanding and response accuracy.

Show HN: Open-source text classification CLI – train models with no labeled data
Open-source CLI tool aiq enables text classification and model training without labeled data, leveraging LLM APIs for labeling.

New AI diffusion model approach solves the aspect ratio problem
Rice University researchers developed ElasticDiffusion, a new AI model that effectively generates images with varying aspect ratios.

Show HN: LeanRL: Fast PyTorch RL with Torch.compile and CUDA Graphs
LeanRL optimizes PyTorch RL scripts using torch.compile and CUDA graphs for faster training.

Notes on Using LLMs for Code
Simon Willison discusses using LLMs for coding, highlighting their role in rapid prototyping and enhancing developer productivity.

Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models
The paper presents "late chunking," a method for better contextual chunk embeddings using long-context models, enhancing retrieval tasks.

Re-opened Three Mile Island will power AI data centers under new deal
Microsoft's deal to re-open Three Mile Island will supply energy for AI data centers, marking a nuclear energy revival.

The Snake Oil Salesman's Promises for Tesla's Full Self-Driving AI
Tesla's Full Self-Driving AI faces skepticism as promises from its proponents echo snake oil salesmanship.

The Sobering Reality of AI: A Researcher's Perspective
A researcher reveals AI's hype is misleading, citing a mere 10% success rate and consistent failures in simple tasks.

Show HN: MinDB – an extremely memory-efficient vector database
MinDB is a highly memory-efficient vector database, enabling indexing of 100M vectors with only 3GB RAM, outperforming traditional databases.

BFCL V3: LLM Multi-Turn and Multi-Step Function Calling Evaluation
BFCL V3 enhances LLM evaluation with multi-turn and multi-step function calling, improving interaction and task execution accuracy.

Self-Supervised Learning at ECCV 2024
The ECCV 2024 workshop on Self-Supervised Learning explores innovative techniques to enhance data efficiency and model interpretability.

Enhancing multilingual voice toxicity detection
Roblox Research develops AI for improved multilingual voice toxicity detection in gaming environments.

Try Out OpenAI O1 in GitHub Copilot and Models
OpenAI launches o1-preview and o1-mini for GitHub Copilot, enhancing coding with advanced reasoning and problem-solving capabilities.

Microsoft Reopening Three Mile Island to Power AI
Microsoft will restart Three Mile Island to power its AI data centers, boosting clean energy and supporting decarbonization efforts.

Operationalizing AI with Postgres: Vector Databases and More
Postgres enhances AI capabilities through vector databases, enabling efficient data management and operationalizing AI solutions.

Transcript for Yann LeCun: AGI and the Future of AI – Lex Fridman Podcast
Yann LeCun discusses AI's future, emphasizing open-source development, the limits of LLMs, and the potential for AGI through innovative architectures.

Stop Bullying LLMs
A call to end bullying of language models, emphasizing respect and ethical treatment in AI interactions.

We accidentally created the perfect DB for AI agents
A perfect database for AI agents was unintentionally created, showcasing unexpected potential in tech development.
