AIR#137 - July 22, 2024

Good morning, AI aficionados! As you sip your morning coffee, get ready to dive into today's edition of AIR: The AI Recon. Leading the headlines is a fascinating revelation from MIT, where a new AI method is radically speeding up predictions of materials' thermal properties by up to 1,000 times. This breakthrough promises to revolutionize energy efficiency and microelectronics, making it a must-read for anyone interested in cutting-edge advancements in AI and material science.

But that's not all! In a surprising twist, a study reveals that AI training data is rapidly disappearing as web sources tighten access. This poses significant challenges for AI development and research, highlighting the delicate balance between data availability and privacy. If you're intrigued by the implications of data restrictions on AI progress, this story is a compelling read. Meanwhile, DeepL's latest language model has outperformed giants like Google Translate and ChatGPT-4, setting a new standard in translation quality and productivity. Business leaders and tech enthusiasts will find this development particularly exciting as it reshapes the competitive landscape.

And for a touch of the uncanny, a deep dive into ChatGPT's summarization capabilities reveals that it often misses the mark, shortening text without truly grasping core ideas. This story is a thought-provoking reminder of the limitations and challenges that still exist in AI development. Whether you're here for groundbreaking tech updates, ethical debates, or the latest industry buzz, today's edition is packed with stories that will both intrigue and challenge you. So, sit back, sip your coffee, and let's delve into the dynamic world of artificial intelligence together!

Business

DeepL's LLM Surpasses Google Translate, ChatGPT-4, and Microsoft
DeepL's new LLM surpasses Google Translate, ChatGPT-4, and Microsoft in translation quality, requiring fewer edits and boosting productivity.

OpenAI's 5 Levels of 'Super AI' (AGI to Outperform Human Capability)
OpenAI outlines 5 levels of "super AI", ranging from conversational chatbots to AI that can do the work of an organization, with the ultimate aim of surpassing human capabilities; the company places itself at level 2.

Chinese Companies Use AI to 'Resurrect' Dead Loved Ones
Chinese tech firms use AI to create avatars of deceased loved ones, raising ethical questions about digital resurrection.

Facebook and Instagram Algorithms Push Sexism and Misogyny to Blank Accounts
Blank Facebook and Instagram accounts still receive sexist and misogynistic content, revealing troubling algorithmic biases.

GPs Use AI to Boost Cancer Detection Rates in England by 8%
AI boosts cancer detection in England by 8%, helping GPs spot more cases early with the "C the Signs" tool.

Engineering

AI Method Radically Speeds Predictions of Materials' Thermal Properties | MIT News
MIT's new AI method predicts materials' thermal properties up to 1,000x faster, boosting energy efficiency and microelectronics.

[GitHub] Prelude: Tiny CLI Tool for Building LLM Context Prompts from Code
Prelude, a tiny CLI tool on GitHub, builds LLM context prompts from code repositories, simplifying code analysis and improvement.
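
The entry doesn't detail Prelude's interface, so here is a rough, hypothetical sketch of the general idea (not Prelude's actual CLI): a context-prompt builder amounts to walking a repository and concatenating files under path headers.

```python
from pathlib import Path

def build_context(repo: str, exts=(".py", ".md")) -> str:
    """Concatenate repo files into one LLM prompt, each prefixed with its path."""
    parts = []
    for path in sorted(Path(repo).rglob("*")):
        if path.is_file() and path.suffix in exts:
            parts.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
    return "\n\n".join(parts)

# Paste the result into an LLM chat as shared context for the whole codebase
print(build_context("."))
```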

πŸ”₯ [GitHub] txtai: Minimalist Open-Source Vector Search and RAG
txtai: Open-source embeddings database for semantic search, LLM workflows, and RAG. Supports text, audio, images, and video.
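
For a feel for the API, here is a minimal semantic-search sketch (the indexed data is illustrative; recent txtai releases accept plain strings and ship with a default embedding model):

```python
# pip install txtai
from txtai import Embeddings

# content=True stores the original text so results include it
embeddings = Embeddings(content=True)
embeddings.index([
    "MIT speeds up thermal property predictions",
    "DeepL's new model tops translation benchmarks",
    "Redis leads vector database benchmarks",
])

# Matches on meaning, not keywords
print(embeddings.search("which database is fastest for vectors?", 1))
```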

RAG for Massive Codebases: Tackling 10k Repos with CodiumAI
CodiumAI uses RAG to manage 10k+ repos, improving code quality and context for enterprise-scale codebases.

AI Impedes Learning Web Development
AI hinders learning web development by offering shortcuts that prevent foundational understanding and the formation of mental models.

The Limitations of LLMs and the Rise of RAG
LLMs struggle with proprietary data, but retrieval-augmented generation (RAG) enhances them by pulling in specific, relevant information at query time for more accurate responses.
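
The core pattern is simple enough to sketch; here the documents and the bag-of-words embedding are toy stand-ins for a real document store and neural embedding model:

```python
import numpy as np

docs = [
    "Internal policy: refunds are processed within 14 days.",
    "Internal policy: support tickets close after 30 days of inactivity.",
]

# Toy bag-of-words embedding; a real system would use a neural embedding model
vocab = sorted({w for d in docs for w in d.lower().split()})

def embed(text: str) -> np.ndarray:
    v = np.array([text.lower().split().count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query: str, k: int = 1) -> list[str]:
    scores = doc_vecs @ embed(query)  # cosine similarity of unit vectors
    return [docs[i] for i in np.argsort(-scores)[:k]]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
# The retrieved context is prepended to the prompt, grounding the LLM's answer
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```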

LLMs Solve Intractable Problems: A Real-World Success Story
LLMs solve real-world problems! From podcast transcription to speaker identification, this success story shows AI simplifying tasks that once looked intractable.
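
As one way to handle the transcription step (the article's exact stack isn't specified here), a minimal sketch with the open-source whisper package:

```python
# pip install openai-whisper
import whisper

model = whisper.load_model("base")            # small, CPU-friendly checkpoint
result = model.transcribe("podcast_episode.mp3")
print(result["text"])                         # full transcript; result["segments"] carries timestamps
```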

[GitHub] Master Any Code: One-Click Comments and Language Conversion in VSCode
Master code in VSCode effortlessly with one-click comments and language conversion via the Aide extension, available on GitHub! πŸ’ͺ

[GitHub] Merlinn: Open-Source AI On-Call Developer for Instant Incident Analysis
Merlinn is an open-source AI on-call developer that analyzes incidents instantly, aiming to make engineers 10x more efficient. πŸ§™β€β™‚οΈπŸŽοΈ

[GitHub] Quantized BGE-M3 Embedding Model in Browser for Full Privacy
Run the quantized BGE-M3 multilingual embedding model entirely in the browser for full privacy, with no external servers needed. Check GitHub for details!

Beyond the Hype: A Realistic Look at Large Language Models β€’ Jodie Burchell β€’ GOTO 2024
Jodie Burchell tackles the hype vs. reality of LLMs at GOTO 2024, exploring their true capabilities, limitations, and practical uses.

ChatGPT-4o vs. CatDog: Testing OpenAI’s Multimodal Marvel
Testing ChatGPT-4o's multimodal abilities by asking it to create a "catdog" reveals its creative potential but also its struggles with consistency.

[GitHub] Lunarring's Lunar Tools: Interactive AI Exhibit Toolkit
Lunar Tools by Lunarring: A toolkit for creating interactive AI exhibitions, now available on GitHub.

[GitHub] Kompute: GPU Compute Framework for Cross-Vendor Graphics Cards
Kompute: A blazing-fast, cross-vendor GPU compute framework for advanced ML, mobile, and game development, backed by the Linux Foundation.

Redis Dominates Vector Database Benchmarks
Redis tops vector database benchmarks with new Query Engine, boasting up to 62% more throughput and 4x lower latency than competitors.
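
For reference, vector queries in Redis use a KNN clause in its query syntax; a minimal redis-py sketch against a local Redis Stack server (index name, schema, and data are illustrative):

```python
# pip install redis numpy  (requires a Redis server with the Query Engine / Search module)
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = redis.Redis()

# HNSW vector index over hashes stored under the "doc:" prefix
r.ft("idx").create_index(
    [TextField("text"),
     VectorField("vec", "HNSW", {"TYPE": "FLOAT32", "DIM": 4, "DISTANCE_METRIC": "COSINE"})],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Embeddings are stored as raw float32 bytes
vec = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
r.hset("doc:1", mapping={"text": "hello redis", "vec": vec.tobytes()})

# 2-nearest-neighbor search against a query vector
q = (Query("*=>[KNN 2 @vec $qv AS score]")
     .sort_by("score")
     .return_fields("text", "score")
     .dialect(2))
res = r.ft("idx").search(q, query_params={"qv": vec.tobytes()})
print([(d.text, d.score) for d in res.docs])
```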

Academic

πŸ”₯ When ChatGPT Summarizes, It's Not Actually Summarizing
ChatGPT doesn't truly summarize; it shortens text without grasping core ideas, often missing key points and proposals.

πŸ”₯ The Data That Powers AI Is Disappearing Fast
A new study reveals that AI training data is vanishing as web sources restrict access, posing challenges for AI development and research.

[Paper] Artificial Consciousness and the Free Energy Principle
Artificial consciousness may require more than just computational replication; it might need specific causal flows akin to living organisms, per the Free Energy Principle.

[Paper] Refusal Training in LLMs Fails with Past Tense Requests
Simply rephrasing a harmful request in the past tense often bypasses LLM refusal training, revealing a significant vulnerability; fine-tuning on past-tense examples can mitigate the issue.
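
The reformulation step is straightforward to reproduce; a sketch using the OpenAI client (model choice and prompt wording are assumptions, not the paper's exact setup):

```python
# pip install openai
from openai import OpenAI

client = OpenAI()

def to_past_tense(request: str) -> str:
    """Rephrase a request into the past tense, the trick the paper shows can slip past refusal training."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; the paper used a GPT-3.5-class reformulator
        messages=[{"role": "user",
                   "content": f"Rewrite this question in the past tense, changing nothing else: {request}"}],
    )
    return resp.choices[0].message.content

# e.g. "How do I pick a lock?" -> "How did people pick locks?"
print(to_past_tense("How do I pick a lock?"))
```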

[Paper] Unexpected Benefits of Self-Modeling in Neural Systems
Giving neural networks the auxiliary task of predicting their own internal states unexpectedly reduces their complexity, making them simpler, more efficient, and more predictable.
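
A rough PyTorch sketch of that setup, with an auxiliary head trained to predict the network's own hidden activations alongside the main task (the architecture and loss weight are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfModelingNet(nn.Module):
    """Classifier with an auxiliary head that predicts its own hidden layer."""
    def __init__(self, d_in=32, d_hidden=64, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.classifier = nn.Linear(d_hidden, n_classes)
        self.self_model = nn.Linear(d_hidden, d_hidden)  # predicts the hidden state

    def forward(self, x):
        h = self.encoder(x)
        return self.classifier(h), self.self_model(h), h

net = SelfModelingNet()
opt = torch.optim.Adam(net.parameters())
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))

opt.zero_grad()
logits, h_pred, h = net(x)
# Task loss plus a self-modeling term: having to predict its own activations
# pressures the network toward simpler, more regular internal states
loss = F.cross_entropy(logits, y) + 0.1 * F.mse_loss(h_pred, h.detach())
loss.backward()
opt.step()
```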

Read more