Large Multimodal Models (LMMs): Beyond Text and Images

Multimodal AI

Multimodal AI processes varied inputs such as text, photos, and video, letting digital assistants learn more about you and the environment around you. It becomes even more powerful when it can run directly on your device.

Large Multimodal Models (LMMs)

For all its capability, generative artificial intelligence (AI) can only do so much, because it is limited by how well it perceives its environment. Large multimodal models (LMMs) can analyze text, photos, videos, radio-frequency data, and even voice queries in order to offer more precise and relevant responses.

It’s an important step in the evolution of generative AI beyond the widely used large language models (LLMs), such as the initial ChatGPT model, which could only process text. This improved ability to understand what you see and hear will greatly benefit your PC, smartphone, and productivity apps, and it will make digital assistants far more helpful. And if the device itself can handle this processing, the experience will be faster, more private, and more power-efficient.

LLaVA: Large Language and Vision Assistant

Qualcomm Technologies is dedicated to making multimodal AI available on devices. Back in February, we first demonstrated Large Language and Vision Assistant (LLaVA), a community-driven LMM with more than seven billion parameters, running on an Android phone powered by the Snapdragon 8 Gen 3 Mobile Platform. In the demonstration, the phone could “recognize” images, such as a platter of fruits and vegetables or a dog in an open field, and hold a conversation about them. A user could ask for a recipe made with the items on the platter, or for an estimate of how many calories the recipe would contain overall. Take a look:

The AI of the future is multimodal

Given the growing clamor around multimodal AI, this work is crucial. Last week, Microsoft unveiled the Phi-3.5 family of models, which offers vision and multilingual support. That came after Google touted LMMs during its Made by Google event, where the multimodal input model Gemini Nano was unveiled. In May, OpenAI unveiled GPT-4 Omni, a natively multimodal model. All of this builds on comparable research from Meta and community-developed models like LLaVA.

Taken together, these developments show the direction artificial intelligence is heading: beyond simply typing questions at a prompt. Qualcomm’s goal is to make these AI experiences available on billions of phones worldwide.

Qualcomm Technologies is collaborating with Google to bring the next generation of Gemini to Snapdragon, and it is working with a wide range of companies producing LMMs and LLMs, including Meta with its Llama series. With the help of these partners, the models run seamlessly on Snapdragon, and we can’t wait to surprise customers with more on-device AI features this year and next.

While an Android phone is a great place to start with multimodal inputs, other device categories will soon reap the benefits as well. Smart glasses that can scan your food and provide nutritional information, or cars that understand your voice commands and assist you while driving, are just a few examples.

Multimodal AI can handle many complex tasks

These scenarios are just the beginning for multimodal AI, which could use a mix of cameras, microphones, and vehicle sensors to spot disengaged passengers in the back of a car and suggest entertaining activities to pass the time. It could also let smart glasses identify exercise equipment at a health club and generate a personalized training schedule for you.

The precision enabled by multimodal AI will also be important in helping a field technician diagnose problems with your household appliances, or in guiding a farmer to the root cause of crop problems.

The idea is that by using cameras, microphones, and other sensors, these devices (starting with phones, PCs, automobiles, and smart glasses) can let the AI assistant “see” and “hear” in order to provide more insightful, contextual responses.

The significance of on-device processing

For all these added capabilities to work well, your phone or car must have enough processing power to handle the requests. And because the battery in your phone must last the entire day, trillions of operations need to happen quickly and efficiently. Running on the device means you avoid pinging the cloud and waiting for busy servers to respond. It’s also more private, because your data and the answers you receive stay on your device.

That has been Qualcomm Technologies’ top priority. Handsets can handle much of this processing on the phone itself thanks to the Hexagon NPU in the Snapdragon 8 Gen 3 processor. Likewise, the Snapdragon X Elite and Snapdragon X Plus platforms enable the more than 20 Copilot+ PCs on the market today to run complex AI functions on the device.

Read more on govindhtech.com

Valkey 7.2 On Memorystore: Open-Source Key-Value Service

Google Cloud has launched Memorystore for Valkey, a 100% open-source key-value service.

In order to give users a high-performance, genuinely open-source key-value service, the Memorystore team is happy to announce the preview launch of Valkey 7.2 support for Memorystore.

Memorystore for Valkey

Memorystore for Valkey is a fully managed Valkey Cluster service for Google Cloud. By relying on this highly scalable, reliable, and secure Valkey service, applications on Google Cloud can achieve excellent performance without having to manage complex Valkey deployments.

To guarantee high availability, Memorystore for Valkey distributes (or “shards”) your data across primary nodes and replicates it to optional replica nodes. Because Valkey performs better on many smaller nodes than on fewer bigger ones, this horizontally scalable architecture outperforms a vertically scalable one.
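To make the sharding concrete, here is a minimal Python sketch of how Valkey Cluster (like Redis Cluster) maps a key to one of its 16,384 hash slots using CRC16-XMODEM, including the hash-tag rule that keeps related keys on the same shard. This is an illustrative reimplementation for the example, not code from Memorystore itself:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-XMODEM (polynomial 0x1021), the checksum Redis/Valkey
    Cluster uses for key-to-slot mapping."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_slot(key: str, total_slots: int = 16384) -> int:
    """Map a key to a hash slot. If the key contains a non-empty
    "{tag}", only the tag is hashed, so related keys share a slot."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % total_slots
```

Each primary node owns a contiguous range of these slots, which is what makes zero-downtime scaling a matter of migrating slots between nodes.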

Memorystore for Valkey is a game-changer for enterprises that want high-performance data management built on 100% open-source software. It joins Memorystore for Redis Cluster and Memorystore for Redis in the Memorystore portfolio in response to customer demand. From the console or gcloud, users can now quickly and simply create a fully managed Valkey Cluster, then scale it up or down to suit the demands of their workloads.

Thanks to its outstanding performance, scalability, and flexibility, Valkey has quickly gained popularity as an open-source key-value datastore. Under the Linux Foundation, Valkey 7.2 gives Google Cloud users a genuinely open-source solution. It is fully compatible with Redis 7.2 and with the most widely used Redis clients, including Jedis, redis-py, node-redis, and go-redis.

Customers are already using Valkey to replace their key-value software, for common use cases such as caching, session management, real-time analytics, and many more.
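As an illustration of the caching use case, here is a minimal cache-aside sketch. A plain Python dict stands in for a Valkey instance (in real code you would call a client such as redis-py against your Memorystore endpoint); the loader callback and TTL handling are assumptions made for the example:

```python
import time

class CacheAside:
    """Cache-aside pattern with TTL. A dict stands in for Valkey;
    swap `self.store` for real GET/SET calls in production."""
    def __init__(self, loader, ttl_seconds=60.0):
        self.loader = loader   # fetches from the source of truth on a miss
        self.ttl = ttl_seconds
        self.store = {}        # key -> (value, expiry timestamp)
        self.misses = 0

    def get(self, key):
        entry = self.store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]                        # cache hit
        self.misses += 1
        value = self.loader(key)                   # fall back to the database
        self.store[key] = (value, now + self.ttl)  # populate the cache
        return value
```

The point of the pattern is that repeated reads within the TTL never touch the backing store, which is where a low-latency key-value service pays off.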

Memorystore for Valkey launches with all the GA capabilities of Memorystore for Redis Cluster, so customers get a nearly identical (and code-compatible) cluster experience. Like Memorystore for Redis Cluster, it provides RDB and AOF persistence, zero-downtime scaling in and out, single- or multi-zone clusters, integrations with Google Cloud, extremely low and dependable latency, and much more. Instances up to 14.5 TB are also available.

Memorystore for Valkey, Memorystore for Redis Cluster, and Memorystore for Redis have an exciting roadmap of features and capabilities.

The momentum of Valkey

In March 2024, just days after Redis Inc. moved Redis off its open-source license, the open-source community launched Valkey in collaboration with the Linux Foundation (1, 2, 3). Since then, Google Cloud has had the pleasure of working with developers and businesses worldwide to propel Valkey to the forefront of key-value data stores and establish it as a premier open-source software (OSS) project. Google Cloud was excited to take part in the community launch alongside partners and industry experts like Snap, Ericsson, AWS, Verizon, Alibaba Cloud, Aiven, Chainguard, Heroku, Huawei, Oracle, Percona, Ampere, AlmaLinux OS Foundation, DigitalOcean, Broadcom, Memurai, Instaclustr from NetApp, and numerous others, and it fervently supports open-source software.

The Valkey community has grown into a thriving group committed to making Valkey the best open-source key-value store available, thanks to the support of thousands of enthusiastic developers and of the former core OSS Redis maintainers who were not hired by Redis Inc.

With more than 100 million unique active users each month, Mercado Libre is the biggest finance, logistics, and e-commerce company in Latin America. Diego Delgado, a Senior Software Expert at Mercado Libre, discusses Valkey:

“At Mercado Libre, we need to handle billions of requests per minute with minimal latency, which makes caching solutions essential. We are especially thrilled about the cutting-edge possibilities that Valkey offers, and we are excited to explore its new features and contribute to this open-source endeavor.”

The best is yet to come

With the release of Memorystore for Valkey 7.2, Memorystore now offers more than just Redis Cluster, Redis, and Memcached. And Google Cloud is even more excited about Valkey 8.0’s transformative features. In the first release candidate of Valkey 8.0, the community introduced major improvements in five key areas: performance, reliability, replication, observability, and efficiency. Users can adopt Valkey 7.2 today and later upgrade to Valkey 8.0 with a single click or command. And Valkey 8.0 remains compatible with Redis 7.2, exactly as Valkey 7.2 is, guaranteeing a seamless transition.

The performance improvements in Valkey 8.0 are perhaps the most intriguing. Asynchronous I/O threading allows commands to be processed in parallel, which can let multi-core nodes run at more than twice the throughput of Redis 7.2. On the reliability front, a number of improvements contributed by Google, such as replicating slot migration states, guaranteeing automatic failover for empty shards, and ensuring slot state recovery, significantly increase the dependability of cluster scaling operations. With a wealth of further advancements across several dimensions (see the release notes), the anticipation for Valkey 8.0 is already fueling demand for Valkey 7.2 on Memorystore.
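To build intuition for the I/O-threading design, here is a deliberately simplified toy model in Python. It is not actual Valkey internals: worker threads handle command parsing in parallel (the I/O-bound part in the real server), while a single main thread still applies commands to the keyspace, preserving the single-writer data model:

```python
from concurrent.futures import ThreadPoolExecutor

def parse(raw: bytes):
    # Stand-in for RESP protocol parsing, done off the main thread.
    parts = raw.decode().split()
    return parts[0].upper(), parts[1:]

def run_commands(raw_commands, io_threads=4):
    """Parse commands in parallel, execute them serially."""
    keyspace = {}
    with ThreadPoolExecutor(max_workers=io_threads) as pool:
        parsed = pool.map(parse, raw_commands)  # parallel parsing
        for cmd, args in parsed:                # serial execution
            if cmd == "SET":
                keyspace[args[0]] = args[1]
            elif cmd == "GET":
                keyspace.get(args[0])
    return keyspace
```

The design choice being modeled: parallelism is applied where commands are independent (reading and parsing), while mutation of the keyspace stays on one thread, so no locks are needed on the data itself.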

Just as Redis previously expanded its capabilities through modules with restrictive licensing, the community is accelerating the development of Valkey’s capabilities through open-source additions that complement and extend its functionality. Recently published RFCs (“Requests for Comments”) cover vector search for extremely high-performance vector similarity search, JSON for native JSON support, and BloomFilters for fast, space-efficient probabilistic filters.
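To illustrate what the Bloom filter RFC is about, here is a minimal, self-contained Bloom filter in Python. The bit-array size and the SHA-256-based hashing scheme are arbitrary choices for this sketch, not the design proposed in the RFC:

```python
import hashlib

class BloomFilter:
    """Space-efficient probabilistic set membership: `might_contain`
    can return false positives but never false negatives."""
    def __init__(self, size_bits: int = 1 << 16, num_hashes: int = 4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

The appeal for a key-value service is the footprint: membership checks over millions of items cost a fixed, small number of bits per item instead of storing the items themselves.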

Sanjeev Mohan, former Gartner vice president and principal analyst at SanjMo, offers his viewpoint:

Valkey is essential to the advancement of community-led initiatives that offer feature-rich, open-source database alternatives. The introduction of Valkey support in Memorystore is another illustration of Google’s commitment to truly open and accessible solutions for customers. Its contributions to Valkey not only help developers looking for flexibility but also support the larger open-source ecosystem.

With all the innovation in Valkey 8.0, along with the open-source improvements like vector search, JSON support, and client libraries, it seems clear that Valkey will be a game-changer in the high-performance data management space.

Valkey is the secret to an OSS future

Try Memorystore for Valkey today: create your first cluster from the UI console or with a straightforward gcloud command. Take advantage of OSS Redis compatibility to easily port your apps over, and scale in or out without any downtime.

