Intro to the world of Generative AI

Siva Gollapalli
4 min read · May 21, 2024


credits: https://prolifics.com/us/resource-center/blog/gen-ai-the-basics

For the last few weeks, I have been studying Generative AI and the different kinds of models, tools, frameworks, and techniques available in the market. The main aim of this blog post is to share that information and provide an overview of the different aspects of Generative AI. So, let's dive in:

What is Generative AI?

As per Wikipedia, here is the definition: "Generative artificial intelligence (generative AI, GenAI, or GAI) is artificial intelligence capable of generating text, images, videos, or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics."

That definition throws around a lot of jargon that can be difficult to understand. Let me break it down for you:

What is learning?

AI models are loosely inspired by the human brain. Just as the brain learns things through the five senses, a machine-learning model will learn patterns from input data. The data could be numbers, natural text, audio, video, etc.
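To make "learning patterns from data" concrete, here is a toy sketch — not a real neural network, just a single-parameter model that discovers the pattern y = 2x by repeatedly correcting its own errors:

```python
# A toy illustration of "learning": fit y = w * x to data generated
# with w = 2, by nudging w to reduce the prediction error.
data = [(1, 2), (2, 4), (3, 6), (4, 8)]  # (input, expected output) pairs

w = 0.0    # the model's single parameter, starting with no knowledge
lr = 0.01  # learning rate: how big each correction step is

for _ in range(1000):
    for x, y in data:
        pred = w * x         # the model's current guess
        error = pred - y     # how wrong the guess is
        w -= lr * error * x  # adjust w to reduce the error

print(round(w, 2))  # converges close to the true pattern, w = 2
```

Real models do exactly this, except with billions of parameters and far richer data.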

What are LLM Models?

Large language models, aka LLMs, are generative AI models that are primarily trained on a large corpus of data. We can interact with these models through natural language, aka prompts, and get output in a structured format or just plain text. A few popular examples are GPT-4o by OpenAI, Llama by Meta, Gemini by Google, and Mistral by Mistral AI. All these models are built on top of the transformer architecture.

What is a prompt?

It is simple natural language text that the AI model can understand and act on according to the instructions given by the user. For example, "Can you explain Generative AI with examples?" is one kind of prompt.
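In practice, applications rarely send the user's raw question; they build prompts from templates that add instructions around it. A minimal sketch (the template wording here is hypothetical):

```python
# Applications typically wrap user input in a prompt template
# that adds instructions for the model.
template = (
    "You are a helpful teacher.\n"
    "Question: {question}\n"
    "Answer in two sentences with one example."
)

# Fill the template with the user's actual question.
prompt = template.format(question="Can you explain Generative AI with examples?")
print(prompt)
```

The filled-in string is what actually gets sent to the LLM.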

What is an embedding?

Unlike humans, computers can't understand natural text directly. They can only work with numbers (which are internally represented as 0s and 1s). The process of converting natural text, audio, and video into a set of numbers is called embedding. A few embedding models are text-embedding-3-small by OpenAI, mistral-embed by Mistral, etc. In the embedding process, the input is split into small chunks of text called tokens, and each token is converted into a vector of numbers. Most LLM providers charge per number of tokens processed.
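The useful property of embeddings is that similar meanings end up as similar vectors. Here is a sketch with made-up 3-dimensional vectors (real models like text-embedding-3-small produce vectors with hundreds or thousands of dimensions):

```python
import math

# Hypothetical 3-dimensional embeddings for three words, chosen so that
# "dog" and "puppy" point in a similar direction and "car" does not.
embeddings = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    # Standard similarity measure between two vectors: 1.0 = same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Related words get higher similarity scores than unrelated ones.
print(cosine_similarity(embeddings["dog"], embeddings["puppy"]) >
      cosine_similarity(embeddings["dog"], embeddings["car"]))  # True
```

This is exactly how vector databases retrieve relevant documents: embed the query, then find stored vectors with the highest similarity.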

What are fine-tuning and transfer learning?

All the above LLMs are general-purpose models, which means they can work on math problems, write poetry, summarize text, generate code, etc. But when you want to use them for a specific problem, like identifying SNOMED codes from a given text, these models won't be able to complete that task accurately. For that, we need to re-train them on specific datasets. There are two ways we can achieve this:

  • Transfer learning: It focuses on reusing the knowledge a model acquired from one training dataset to solve a different problem, by retraining only the outer layers and leaving the remaining internal layers untouched.
  • Fine-tuning: Unlike transfer learning, it also updates the deeper layers of a model according to the new dataset, while the base knowledge is retained, so the model performs accurately on the specific task.
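The difference between the two approaches boils down to which layers are allowed to change during re-training. A conceptual sketch (not a real deep-learning framework; the layer names are illustrative):

```python
# Conceptual sketch: which layers of a pretrained model may change
# during re-training. True = trainable, False = frozen.
layers = ["embedding", "block_1", "block_2", "block_3", "output_head"]

def transfer_learning(layers):
    # Freeze every internal layer; only the outer head is retrained.
    return {name: (name == "output_head") for name in layers}

def fine_tuning(layers, unfreeze_from="block_2"):
    # Also update deeper layers from some point onward,
    # keeping the earliest base layers frozen.
    start = layers.index(unfreeze_from)
    return {name: (i >= start) for i, name in enumerate(layers)}

print(transfer_learning(layers))
print(fine_tuning(layers))
```

In frameworks like PyTorch, "freezing" a layer means disabling gradient updates for its parameters.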

How to train AI models:

There are a few tools available in the market. Here are the most popular:

  • Autotrain: It is a no-code platform developed by Hugging Face that simplifies the training and deployment of LLMs, making the process accessible with minimal effort.
  • Unsloth: It is a library written in Python that is more performant and has a smaller memory footprint. Along with the library, they provide a hosted service where you can train AI models and run inference in your applications.
  • Sagemaker: If you prefer a hosted platform like AWS, SageMaker is available as a managed service.

Where to Host LLM models?

After training, we need to host our models just like any other application. We can build an API layer on top of the model using any web framework, use SageMaker from AWS, or host them on Hugging Face, etc.
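The API layer itself can be very thin. A minimal sketch using only the standard library, where `generate()` is a stub standing in for a real model call (e.g. a SageMaker or Hugging Face inference endpoint):

```python
import json

def generate(prompt):
    # Stub: a real deployment would run model inference here,
    # or forward the prompt to a hosted inference endpoint.
    return f"Echo: {prompt}"

def handle_request(body: bytes) -> bytes:
    # Parse the incoming JSON request, call the model, return JSON.
    payload = json.loads(body)
    answer = generate(payload["prompt"])
    return json.dumps({"completion": answer}).encode()

# This handler could be wired into any web framework (or even the
# standard library's http.server) as a POST endpoint.
print(handle_request(b'{"prompt": "Hello"}'))
```

Managed platforms like SageMaker generate this request/response plumbing for you; rolling your own just means writing a handler like this.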

Frameworks to build LLM applications:

  • Langchain: It is an all-in-one framework where we can easily develop applications combining various technologies like vector databases, LLMs, relational databases, etc. in a few lines of code. The main use cases are text summarization, chatbots, RAG, agents, etc.
  • LangGraph: It helps build stateful multi-actor applications for LLMs. It lets you add checkpoints between cyclic computation chains using Python/JS functions. LangGraph is part of Langchain itself.
  • LlamaIndex: It is similar to Langchain but positioned as a data framework. If you have different kinds of data and want to use them to build LLM applications, it would be a good choice.
  • phidata: It lets you build assistants on top of different LLMs.
  • Langflow: It is similar to Langchain, but instead of coding you drag and drop UI components to build apps with no code.
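What most of these frameworks have in common is composing steps into pipelines, often called "chains": retrieve context, build a prompt, call the model. A plain-Python sketch of that idea (this is not the actual Langchain API; every function here is a stub):

```python
def retrieve(question):
    # Stub: a real app would query a vector database for relevant context.
    return f"{question} | context: GenAI models generate new data from learned patterns"

def build_prompt(text):
    # Wrap the retrieved context in instructions for the model.
    return f"Answer using the context.\n{text}"

def llm(prompt):
    # Stub LLM call; a real chain would hit a model API here.
    return f"[model answer for: {prompt[:30]}...]"

def chain(value, steps):
    # Pipe the output of each step into the next one.
    for step in steps:
        value = step(value)
    return value

result = chain("What is Generative AI?", [retrieve, build_prompt, llm])
print(result)
```

This retrieve-then-generate pattern is exactly the RAG use case mentioned above; frameworks like Langchain add ready-made components for each step.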

Just to keep the post concise, I didn't cover how all these pieces fit together to build a useful application. I will share that information soon. Please share your own knowledge in the comments, and let me know what kind of topics you want me to cover next.

I hope I have covered the different kinds of tools needed for LLM development; there are surely many more that I am not aware of. I hope this post gives you an idea of the different aspects of the Generative AI space.

Hope this is helpful and any feedback would be welcome.

Welcome to the world of GENERATIVE AI !!!!
