
AI2 is developing a large language model optimized for science

PaLM 2. GPT-4. The list of text-generating AI practically grows by the day.

Most of these models are walled behind APIs, making it impossible for researchers to see exactly what makes them tick. But increasingly, community efforts are yielding open source AI that’s as sophisticated as, if not more so than, its commercial counterparts.

The latest of these efforts is the Open Language Model, a large language model set to be released by the nonprofit Allen Institute for AI (AI2) sometime in 2024. Open Language Model, or OLMo for short, is being developed in collaboration with AMD and the Large Unified Modern Infrastructure (LUMI) consortium, which provides supercomputing power for training and education, as well as Surge AI and MosaicML (which are providing data and training code).

“The research and technology communities need access to open language models to advance this science,” Hanna Hajishirzi, the senior director of NLP research at AI2, told TechCrunch in an email interview. “With OLMo, we are working to close the gap between public and private research capabilities and knowledge by building a competitive language model.”

One might wonder (this reporter included) why AI2 felt the need to develop an open language model when there are already several to choose from (see Bloom, Meta’s LLaMA, etc.). The way Hajishirzi sees it, while the open source releases to date have been valuable and even boundary-pushing, they’ve missed the mark in various ways.

AI2 sees OLMo as a platform, not just a model: one that’ll allow the research community to take each component AI2 creates and either use it themselves or seek to improve it. Everything AI2 makes for OLMo, including a public demo, training data set and API, will be openly available and documented, with “very limited” exceptions, under “suitable” licensing, Hajishirzi says.

“We’re building OLMo to create greater access for the AI research community to work directly on language models,” Hajishirzi said. “We believe the broad availability of all aspects of OLMo will enable the research community to take what we are creating and work to improve it. Our ultimate goal is to collaboratively build the best open language model in the world.”

OLMo’s other differentiator, according to Noah Smith, senior director of NLP research at AI2, is a focus on enabling the model to better leverage and understand textbooks and academic papers as opposed to, say, code. There have been other attempts at this, like Meta’s infamous Galactica model. But Hajishirzi believes that AI2’s work in academia and the tools it’s developed for research, like Semantic Scholar, will help make OLMo “uniquely suited” for scientific and academic applications.

“We believe OLMo has the potential to be something really special in the field, especially in a landscape where many are rushing to cash in on interest in generative AI models,” Smith said. “AI2’s unique ability to act as third party experts gives us an opportunity to work not only with our own world-class expertise but collaborate with the strongest minds in the industry. As a result, we think our rigorous, documented approach will set the stage for building the next generation of safe, effective AI technologies.”

That’s a nice sentiment, to be sure. But what about the thorny ethical and legal issues around training — and releasing — generative AI? The debate’s raging around the rights of content owners (among other affected stakeholders), and countless nagging issues have yet to be settled in the courts.

To allay concerns, the OLMo team plans to work with AI2’s legal department and to-be-determined outside experts, stopping at “checkpoints” in the model-building process to reassess privacy and intellectual property rights issues.

“We hope that through an open and transparent dialogue about the model and its intended use, we can better understand how to mitigate bias, toxicity, and shine a light on outstanding research questions within the community, ultimately resulting in one of the strongest models available,” Smith said.

What about the potential for misuse? Models, which are often toxic and biased to begin with, are ripe for bad actors intent on spreading disinformation and generating malicious code.

Hajishirzi said that AI2 will use a combination of licensing, model design and selective access to the underlying components to “maximize the scientific benefits while reducing the risk of harmful use.” To guide policy, OLMo has an ethics review committee with internal and external advisors (AI2 wouldn’t say who, exactly) that’ll provide feedback throughout the model creation process.

We’ll see to what extent that makes a difference. For now, a lot’s up in the air — including most of the model’s technical specs. (AI2 did reveal that it’ll have around 70 billion parameters, parameters being the parts of the model learned from historical training data.) Training’s set to begin on LUMI’s supercomputer in Finland — the fastest supercomputer in Europe, as of January — in the coming months.

AI2 is inviting collaborators to contribute to, and critique, the model development process. Those interested can contact the OLMo project organizers.

AI2 is developing a large language model optimized for science by Kyle Wiggers originally published on TechCrunch

https://techcrunch.com/2023/05/11/ai2-is-developing-a-large-language-model-optimized-for-science/



