Thursday, October 3, 2024

From Novice to Master: How Robots Learn to Outperform Humans

We are living in an era where machines can make decisions as complex and nuanced as humans', adapting to new situations with ease. Much of this progress comes from neural network policies trained with large-scale reinforcement learning.



Before we dive into the deep end, let’s refresh our memory on reinforcement learning (RL). Picture an AI agent exploring a new playground (the environment). As it interacts with the swings and slides (states/observations), it makes choices (actions) and learns from the consequences, which can be a successful somersault (a reward) or the disappointment of a scraped knee (a penalty). The agent’s goal? To maximize the fun (cumulative reward) over time.

A classic example of RL in action is an AI mastering the game of chess. Each move is an action, the board configuration is the state, and winning the game is the ultimate reward.

Robot playing Chess
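To make the playground analogy concrete, here is a minimal sketch of the agent/environment loop, using the Gymnasium library's CartPole task as a stand-in playground and a purely random policy for now. The library, the task name, and the variable names are illustrative choices, not part of any specific system described in this post.

import gymnasium as gym  # pip install gymnasium

env = gym.make("CartPole-v1")            # the environment (our "playground")
obs, info = env.reset(seed=0)            # the first observation (the "state")
total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()   # a random choice, for now
    obs, reward, terminated, truncated, info = env.step(action)  # the consequences
    total_reward += reward               # accumulate the "fun"
    done = terminated or truncated
print("Cumulative reward for this episode:", total_reward)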

Now, let’s upgrade our playground explorer with a neural network brain.

A policy in AI is essentially a decision-making strategy. When we represent this policy as a neural network, we give our AI agent the ability to learn incredibly complex relationships between what it observes (states) and what it should do (actions).

Let’s say our robot is learning to dance. A simple policy might be “move left foot, then right foot”. But a neural network policy can learn the intricate choreography of a tango, adapting to different music tempos and dance partners.
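As a rough illustration, here is what such a policy could look like as a small neural network in PyTorch. The class name, the observation size, and the number of "dance moves" are made-up placeholders; the point is simply that the network maps an observation to a probability distribution over actions.

import torch
import torch.nn as nn

class DancePolicy(nn.Module):
    """Hypothetical policy: maps an observation vector to a distribution over discrete moves."""
    def __init__(self, obs_dim=24, n_moves=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, n_moves),      # one logit per possible dance move
        )

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

policy = DancePolicy()
obs = torch.randn(1, 24)                 # fake observation: tempo, partner position, joint angles...
move = policy(obs).sample()              # the policy picks a move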

How is this possible? It all comes down to practice: billions of practice sessions.

Let’s go through the process:

  • Starting from Scratch: Our robot begins with random dance moves (random neural network parameters).
  • Practice Makes Perfect: It starts dancing, making decisions based on its current skill level.
  • Learn from Mistakes (and Successes): The outcomes of each dance session are used to tweak the neural network, gradually improving its performance.

But to reach great performance, it is essential to train the model using sophisticated techniques, such as:

  • Policy Gradient Methods: Imagine a dance coach giving direct feedback. “More hip swing!” translates to adjusting the policy parameters for better performance. Algorithms like REINFORCE and Proximal Policy Optimization (PPO) fall into this category (a short code sketch follows this list).
  • Actor-Critic Methods: Picture a dance duo — the actor (policy) performs the moves, while the critic (value function) judges how well they’re done. This teamwork often leads to more graceful learning.
  • Experience Replay: Think of this as watching recordings of past performances. By revisiting these stored experiences, the neural network can learn more efficiently, picking up on subtle details it might have missed the first time around.
Humanoids and Robots dancing using neural network and artificial intelligence
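To tie the three-step loop and the policy-gradient idea together, here is a bare-bones, REINFORCE-style sketch on the same CartPole task used earlier. The network size, learning rate, and episode count are arbitrary assumptions; real systems such as PPO add many refinements on top of this.

import torch
import torch.nn as nn
import gymnasium as gym

env = gym.make("CartPole-v1")
# 1) Starting from scratch: a tiny policy network with random weights (4 observations -> 2 actions).
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(500):               # 2) Practice makes perfect.
    obs, _ = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        logits = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Categorical(logits=logits)
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, _ = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated

    # 3) Learn from mistakes and successes: make high-return episodes more likely.
    loss = -torch.stack(log_probs).sum() * sum(rewards)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()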

Combining neural network policies with reinforcement learning gives computers and robots a superpower:

  • They can handle complex inputs, like the flood of sensory data a self-driving car needs to process.
  • They learn end-to-end, possibly discovering creative strategies that humans might never have thought of.

The results are astonishing. This technology has led to AI that can beat world champions at Go (AlphaGo) and humanoid robots that can walk the streets of San Francisco.

🎉 Keep learning and discovering AI news. I will be happy if you follow me and clap for this post. Thanks 😸

Decoding AI: Autoregressive Models and Reinforcement Learning Explained

If you’ve ever been curious about the difference between autoregressive modeling and reinforcement learning, you’re in the right place! Let’s break it down in a way that’s easy to digest.

Autoregressive modeling and reinforcement learning are both key players in the AI world, but they serve different purposes and operate in distinct ways.

Autoregressive modeling is a general modeling approach focused on generating sequential data, while reinforcement learning (RL) is better described as a training paradigm.

Autoregressive modeling focuses on generating sequential data: it predicts future values based on what has come before. In machine learning, this often means predicting the next element in a sequence, like forecasting the next word in a sentence based on the words that came before it. For example, transformers, the architecture behind most popular language models, predict the next token by considering all the previous tokens.

The beauty of autoregressive models is their flexibility — they can be implemented using various architectures, including transformers. However, it’s important to note that while transformers can handle autoregressive tasks, they’re also capable of non-autoregressive tasks.

Take ChatGPT, for instance! It uses an autoregressive approach for text generation, predicting each word one at a time, shaped by the context of all the previous words it has generated.
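As an illustration of this one-word-at-a-time process, here is a small sketch using the Hugging Face transformers library with GPT-2 (a much smaller model than ChatGPT, but autoregressive in the same sense). It greedily appends the most likely next token at each step; the prompt text and the 20-token limit are just example choices.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer("Reinforcement learning is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):                                   # generate 20 tokens, one at a time
        logits = model(input_ids).logits                  # scores for every position in the sequence
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)   # most likely next token
        input_ids = torch.cat([input_ids, next_token], dim=-1)       # condition on it at the next step

print(tokenizer.decode(input_ids[0]))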

Now, let’s move on to reinforcement learning (RL), which is a different concept. It’s all about how an agent learns to make decisions by interacting with an environment. The agent takes actions based on its current understanding of the situation and receives feedback, either rewards or penalties, from a reward function designed for this purpose. When we mention Reinforcement Learning from Human Feedback (RLHF), we’re talking about incorporating feedback collected from human input to refine the agent’s learning process.

The agent uses this feedback to discover the best strategy for maximizing its reward over time.

So, what sets them apart?

  • Learning Approach: RL involves dynamic environments where the agent learns through trial and error, while autoregressive models learn from static datasets.
  • Output Generation: Autoregressive models generate sequences in a step-by-step manner based on prior data, while RL focuses on maximizing long-term rewards through a series of actions.
  • Feedback Mechanism: In autoregression, feedback comes from comparing predictions to actual outcomes, while in RL it comes from rewards received from the environment based on the actions taken.

Sunday, September 22, 2024

NEO, the AI Robot That Looks Like a Human: An Amazing Humanoid by 1X, Backed by OpenAI (Home Robotics News)

In the rapidly evolving world of artificial intelligence and robotics, a new player has emerged that promises to revolutionize our homes and potentially reshape our workforce. Meet NEO, the latest creation from 1X Technology: an AI humanoid robot 🤖 designed to seamlessly integrate into our daily lives.

Neo: The AI Robots by 1X and OpenAI

Standing at 1.6 meters tall and weighing just 25 kilograms, NEO is perfectly sized for the average home. But don’t let its compact frame fool you — this robot packs a powerful punch when it comes to capabilities.



NEO’s most striking feature is its natural movement, achieved through a muscle-like design that allows it to navigate home environments with ease. This biomimetic approach sets NEO apart from more mechanical-looking robots, making it a more natural fit in human spaces.

Advanced AI and Embodied Learning

AI robots that look like humans: humanoids
At the heart of NEO’s capabilities is its advanced artificial intelligence. 1X Technology has pioneered a new branch of innovation called embodied learning, which combines robotics and AI to create machines that can learn and adapt through physical interactions with their environment.

This means that NEO isn’t just a pre-programmed machine — it’s a learning entity that grows smarter every day. As it performs tasks and interacts with its surroundings, NEO’s AI continuously evolves, allowing it to better understand and respond to the unique needs of each household.

Backed by Industry Giants

The potential of NEO hasn’t gone unnoticed in the tech world: 1X Technology raised $100 million in a Series B round, with OpenAI, Tiger Global, and EQT Ventures throwing their support behind the company. This backing lends significant credibility to the project and suggests that NEO represents a major leap forward in robotics and AI integration.

A Vision for the Future

Bernt, the CEO of 1X Technology, envisions a future where robots like NEO are commonplace in our homes. “We’re building humanoid robots for the home that are safe, capable, and affordable,” he states. “Our goal is to create robots that can live and learn among us.”

But 1X’s ambitions don’t stop at individual homes. The company is designing NEO with large-scale manufacturing in mind, aiming to produce humanoids at a volume high enough to create an abundant supply of labor and solve problems at a societal level.

Implications for Society

While the potential benefits of NEO are exciting, its development also raises important questions about the future of work. As these intelligent robots become more capable and widespread, there’s a possibility they could displace jobs currently held by humans.

Watch the full video to see NEO in action and learn more about this groundbreaking technology.

Are you as intrigued by NEO and the future of home robotics as we are? Your thoughts and engagement can help shape the discussion around this exciting technology:

  • 💡 If you found this article insightful, please consider giving it a ‘clap’ to help others discover it.
  • 💬 What are your thoughts on AI humanoids in the home? Share your perspective in the comments below.
  • 🔗 Know someone interested in AI and robotics? Spread the word by sharing this article with them.
  • 🔔 Want to stay updated on the latest in AI and robotics? Follow me here on Medium for more articles on cutting-edge technology.

Thank you for taking action. Your engagement helps drive this important conversation forward!

#AIRobot #FutureOfWork #HomeTech #1XTechnology #EmbodiedAI #OpenAI #RoboRevolution

Tuesday, April 2, 2024

Generated Knowledge Prompting vs Knowledge-Based Tuning in the LLM Context

Pretrained large language models (LLMs) are increasingly used and are known for their astonishing performance. While LLMs are continuously improving, they are still limited in specific domains such as mathematics and medicine. A great deal of research has been dedicated to equipping LLMs with external tools and techniques that provide more flexibility and enhance their capabilities.

In this post, I’m excited to present Generated Knowledge Prompting (GKP) and knowledge-based tuning. Although distinct in their approaches, these methods share the common goal of refining model performance and expanding its comprehension across various domains.

Generated Knowledge Prompting (GKP)

GKP is a prompting technique proposed by Liu et al. The key insight behind Generated Knowledge Prompting is to first generate useful knowledge from a language model or external sources, and then provide that knowledge as part of the input prompt, concatenated with the question.

Generated Knowledge Prompting LLM

The intention behind this technique is to steer the model’s outputs towards specific types of responses. By crafting prompts that target particular knowledge domains or concepts, GKP effectively guides the model to understand and generate relevant information.

Consider a scenario where a language model needs to grasp medical terminology and concepts. With Generated Knowledge Prompting, the first step is to generate the relevant medical knowledge. The second is to integrate that knowledge into the prompt as input. The model then analyzes the prompt and uses the provided medical knowledge to generate text that addresses the query or provides information on the specified topic. This method harnesses the power of structured prompts to shape the model’s understanding and output generation capabilities.
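Here is a rough sketch of that two-step flow. The call_llm function is a hypothetical placeholder for whatever model or API you use to generate text; only the structure (generate knowledge first, then concatenate it with the question) reflects the GKP recipe.

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder: send a prompt to your LLM of choice and return its answer."""
    raise NotImplementedError

def generated_knowledge_prompting(question: str, n_knowledge: int = 3) -> str:
    # Step 1: ask a model to produce background knowledge relevant to the question.
    knowledge_prompt = f"Generate a medical fact that is relevant to the question: {question}"
    knowledge = [call_llm(knowledge_prompt) for _ in range(n_knowledge)]

    # Step 2: concatenate the generated knowledge with the original question and answer it.
    context = "\n".join(knowledge)
    answer_prompt = f"Knowledge:\n{context}\n\nQuestion: {question}\nAnswer:"
    return call_llm(answer_prompt)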

Knowledge-Based Tuning

On the other hand, knowledge-based tuning involves fine-tuning a pre-trained model by integrating external data or information from knowledge sources. This process seeks to augment the model’s performance on specific tasks or domains by enriching its understanding with additional knowledge.

Let’s take the example of legal texts. If a language model requires proficiency in comprehending legal language, knowledge-based tuning can be employed. By exposing the model to a dataset comprising legal documents, the model can assimilate legal terminology and concepts, thereby enhancing its ability to interpret and generate responses in legal contexts.
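As a sketch of what this could look like in practice, the snippet below fine-tunes a small causal language model on a hypothetical file of legal documents using the Hugging Face transformers and datasets libraries. The file name, the choice of GPT-2, and the hyperparameters are placeholder assumptions, not a prescribed recipe.

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token      # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical corpus: one legal document (or paragraph) per line in a local text file.
dataset = load_dataset("text", data_files={"train": "legal_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-legal", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()   # the fine-tuned weights now carry the legal-domain knowledge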

Difference Between the Two Techniques

While both Generated Knowledge Prompting (GKP) and knowledge-based tuning contribute to enhancing language models, they differ in their methodologies and objectives. GKP focuses on guiding the model’s behavior through tailored prompts: knowledge is provided to the LLM as input. Knowledge-based tuning, by contrast, modifies the model or its training process to incorporate external knowledge directly, resulting in a fine-tuned model.

In summary, Generated Knowledge Prompting and knowledge-based tuning represent complementary approaches to empowering language models with deeper comprehension and proficiency across diverse subject matters.

That’s all for today. I will share more knowledge around LLMs, so don’t miss the coming posts 🎉. Follow and join the conversation with your experiences and suggestions in the comments below 🌸

Sunday, March 31, 2024

Pip Install OSError: Python Package Installation Fails Because the System Does Not Have Windows Long Path Support Enabled

I have been trying to install the Transformers package to work on a deep learning project. I used a command that I have used many times before and that has always worked perfectly.

But this time, I got the error: “Could not install packages due to an OSError: [WinError 2] The system cannot find the file specified”.

pip install transformers --user


 

ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: ‘C:\\Users\\me\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python312\\site-packages\\transformers\\models\\deprecated\\trajectory_transformer\\convert_trajectory_transformer_original_pytorch_checkpoint_to_pytorch.py’
HINT: This error might have occurred since this system does not have Windows Long Path support enabled. You can find information on how to enable this at https://pip.pypa.io/warnings/enable-long-paths

To fix this OSError, we need to make Windows 10 accept file paths longer than 260 characters.

Here are the steps to follow to fix the error and install the transformers package.

Step 1: Open Registry Editor

To do so, press “Windows + R”, type “regedit”, and press Enter.

Open Registry Editor with Shortcut Windows R

Another option is to search for Registry Editor

Step 2: Navigate to the FileSystem Key

In the registry editor, use the left sidebar and navigate to “CurrentControlSet\Control\FileSystem”

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem

Step 3: Enable Long Path Support

On the right, find a value named “LongPathsEnabled” and double-click it. If you don't see the value listed, you'll need to create it by right-clicking the “FileSystem” key, choosing New > DWORD (32-bit) Value, and then naming the new value “LongPathsEnabled”.

Change the value from 0 to 1 and save your changes.
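If you prefer to verify the setting from Python instead of reading the registry by eye, a small read-only check with the standard-library winreg module could look like this (it assumes you run it on the same Windows machine):

import winreg

key = winreg.OpenKey(
    winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Control\FileSystem",
)
value, _ = winreg.QueryValueEx(key, "LongPathsEnabled")   # reads the DWORD we just set
winreg.CloseKey(key)
print("Long path support enabled:", value == 1)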

You can now close the Registry Editor and restart the computer. In my case, I just restarted Visual Studio and was able to install the transformers package successfully.

🎉 Perfect! I hope this helps you too. Don’t forget to:

  • Like the post 👏
  • Follow me for more Tips
