LLMs and the PaLM API

In Artificial Intelligence, Google I/O, LLMs, Machine Learning, NLP, PaLM by Prabhu Missier

An LLM is a neural network that takes input text and produces a response. LLMs are trained on a huge corpus of text. They can be trained on specific domains, but usually they are trained on diverse datasets containing all sorts of text. LLMs help us prototype AI-powered applications faster.
Google’s new suite of AI developer tools is called the PaLM API, which gives access to Google’s latest LLMs, all hosted on Google Cloud.

The PaLM API can be accessed through a browser using MakerSuite.
The PaLM API is a REST API, and it has client libraries for Python, Java, Node.js and Swift.
The API has three endpoints: Embeddings, Text and Chat.
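Getting started with the Python client library looks roughly like this (a minimal sketch; the package is google-generativeai, and "YOUR_API_KEY" is a placeholder for a key generated in MakerSuite):

```python
# pip install google-generativeai
import google.generativeai as palm

# Authenticate with an API key generated in MakerSuite
# ("YOUR_API_KEY" is a placeholder).
palm.configure(api_key="YOUR_API_KEY")

# See which models this key can use.
for model in palm.list_models():
    print(model.name)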

Text endpoint – used to generate short text responses.
You import the PaLM API, choose a model, supply the prompt and then get the response.
You can adjust the results by supplying a parameter called temperature, which controls the degree of randomness the model uses to generate a response.
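A minimal sketch of the Text endpoint using the Python client library; "models/text-bison-001" was the text model name exposed at launch, and the API key is a placeholder:

```python
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

response = palm.generate_text(
    model="models/text-bison-001",     # PaLM text model at launch
    prompt="Explain what an LLM is in one sentence.",
    temperature=0.2,                   # low temperature -> less random output
    max_output_tokens=64,
)
print(response.result)                 # text of the top candidate
```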

Chat endpoint – here the model needs to maintain state.
All you do is call the chat function with the initial message, a context and some examples.
PaLM has a limit on the chat history it maintains, since the prompt gets updated with the latest chat messages each time. Once the limit is reached, the oldest messages are dropped from the history, and that is where the context and examples come in useful, since they persist for the whole conversation.
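A sketch of the Chat endpoint with the same client library, showing the context, examples and a follow-up message; the persona and examples here are made up for illustration:

```python
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

# context steers the model's behaviour; examples show the expected style.
chat = palm.chat(
    context="You are a friendly tutor who explains ML concepts simply.",
    examples=[("What is a tensor?",
               "A tensor is just a multi-dimensional array of numbers.")],
    messages="What is an embedding?",
)
print(chat.last)  # the model's latest reply

# The response object carries the conversation state, so replying
# keeps the chat history.
chat = chat.reply("Can you give a concrete example?")
print(chat.last)
```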

Embeddings endpoint
Embeddings are a way to convert words, phrases or entire paragraphs into an array of numbers. Similar passages have similar embeddings.
For example, let’s say we want to search the TensorFlow webpages for information about the latest version of TensorFlow. We can use the Embeddings API and the Text API in conjunction to get a response.
We start by generating embeddings for the various sections of the TensorFlow webpages. We then generate the embedding for our query.
Next, we compare the embedding of our query with the embeddings of the webpage sections to find the most relevant one.
Finally, we use the Text API and supply the query along with the relevant section to get an answer, as sketched below.
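Here is how those steps might look with the Python client library; the section texts are hypothetical stand-ins for real TensorFlow pages, and "models/embedding-gecko-001" was the embedding model name at launch:

```python
import numpy as np
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

# Hypothetical stand-ins for sections scraped from the TensorFlow site.
sections = [
    "TensorFlow 2.13 is the latest release and adds ...",
    "Keras is the high-level API for building models in TensorFlow.",
    "tf.data lets you build efficient input pipelines.",
]

def embed(text):
    # "models/embedding-gecko-001" was the embedding model at launch.
    return palm.generate_embeddings(
        model="models/embedding-gecko-001", text=text)["embedding"]

section_vecs = np.array([embed(s) for s in sections])
query = "What is the latest version of TensorFlow?"
query_vec = np.array(embed(query))

# Dot-product similarity: the closest embedding marks the relevant section.
best_section = sections[int(np.argmax(section_vecs @ query_vec))]

# Hand the query plus the retrieved section to the Text endpoint.
answer = palm.generate_text(
    model="models/text-bison-001",
    prompt=f"Passage:\n{best_section}\n\nAnswer using the passage: {query}",
)
print(answer.result)
```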
Embeddings can also be used to train a classifier model with less data than is typically needed. We do this by generating embeddings for the training and test datasets and training a classifier on those embeddings, so the knowledge the LLM has already learnt is applied to our own data. Additional use cases include information retrieval, recommendations and more.
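As a sketch of the classifier idea, PaLM embeddings can be fed into any off-the-shelf classifier; scikit-learn's LogisticRegression below is my choice for illustration rather than something prescribed by the API, and the tiny dataset is made up:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
import google.generativeai as palm

palm.configure(api_key="YOUR_API_KEY")  # placeholder key

def embed(text):
    return palm.generate_embeddings(
        model="models/embedding-gecko-001", text=text)["embedding"]

# A tiny made-up sentiment dataset; real use would need more examples,
# but far fewer than training a text model from scratch.
train_texts = ["I loved this film", "Terrible acting",
               "Great soundtrack", "A waste of time"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

X_train = np.array([embed(t) for t in train_texts])
clf = LogisticRegression().fit(X_train, train_labels)

print(clf.predict([embed("An absolute delight to watch")]))  # expect [1]
```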

Colab Magics for LLMs
Access LLMs in Colab with a single command. Magics are little shortcuts which make it easy to connect to external tools, and one has been added for the PaLM API.
All you need to do is type %%palm followed by a prompt, and you get the result in a pandas DataFrame.
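In a Colab cell that looks like the following, assuming the PaLM magic has already been installed and loaded in the notebook:

```
%%palm
What are large language models good for?
```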

Moving to the cloud
Vertex AI’s Generative AI Studio gives access to the same family of models as the PaLM API and makes it easy to integrate with the rest of Google Cloud.

References
https://developers.googleblog.com/2023/03/announcing-palm-api-and-makersuite.html