AI tools come in all shapes and sizes. Here is my attempt to classify them and to explain some of the terms. AI tools are often referred to as models, and I will use that term in this section.
Open-source models are available publicly without charge. This includes the important weights file and some documentation.
Open-source models can usually be downloaded and run locally. For example, here are the most popular models as of February 2025 on Ollama, a popular tool for downloading and running models locally.
Here is the current full list.
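As a minimal sketch of driving a locally downloaded model from Python via Ollama's library (assuming Ollama is installed and its server is running; llama3.2 is just an example model choice):

```python
import ollama  # pip install ollama; assumes the Ollama server is running locally

# Download a model (llama3.2 is an example choice) and ask it a question.
ollama.pull("llama3.2")
response = ollama.chat(
    model="llama3.2",
    messages=[{"role": "user", "content": "Explain what an open-weight model is."}],
)
print(response["message"]["content"])
```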
Open-weight and open-source are not exactly the same. For an open-weight model, the trained weights are published, but the training data, training code and other details of how the model was built may not be.
A proprietary model is only available through an access mechanism (usually a website or API) from the company that built it. The internal details of how the model works or how it was trained are generally not known.
The lists below are not exhaustive. The most popular proprietary models are:
The most popular open-source (or at least open-weight) models are:
This refers to the number of parameters in the model, which typically ranges from 1 billion to 700 billion. Larger models take more time and resources to process a request but will probably give a better response; for simple prompts, a smaller model is often sufficient. A family of models will often be published in a range of sizes. For example, the deepseek-r1 model comes in seven different sizes, and the smaller models may have been “distilled” from the larger ones.
Parameters are the adjustable knobs of the model, set during training, that help it to capture (in a lossy way) the content of its source corpus (usually a cleaned-up crawled copy of the internet).
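To make "number of parameters" concrete, here is a short sketch that counts the parameters of a small open model (GPT-2 is used purely as an example because it is quick to download):

```python
from transformers import AutoModelForCausalLM  # pip install transformers torch

# Load a small open model and count its parameters.
model = AutoModelForCausalLM.from_pretrained("gpt2")
n_params = sum(p.numel() for p in model.parameters())
print(f"gpt2 has {n_params / 1e6:.0f} million parameters")  # roughly 124 million
```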
These are the latest models, which first split a complex prompt into a series of tasks, attend to each task in turn, and then put it all together. In this way they can solve more difficult problems. They may also show their working.
For example, the screenshot below shows DeepSeek thinking.
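The thinking can also be seen programmatically. A hedged sketch using Ollama's Python library, assuming deepseek-r1 has been pulled locally (depending on the Ollama version, the reasoning appears inline inside <think> tags or in a separate field):

```python
import ollama  # assumes the Ollama server is running and deepseek-r1 is pulled

response = ollama.chat(
    model="deepseek-r1",
    messages=[{"role": "user", "content": "Which is larger, 9.11 or 9.9?"}],
)
# The reply contains the model's reasoning (e.g. inside <think>...</think>),
# followed by the final answer.
print(response["message"]["content"])
```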
Most of the time we interact with an AI tool through its website or app. This looks like a stripped-down user interface, but in fact it provides many features and capabilities. It is a wrapper that allows access to and interaction with a model, but it does much more than simply pass the prompt to the model and display the result. For example:
We can choose a model (either cheap and fast, or slower and more expensive but possibly providing a better response).
We can decide whether to search the web and provide this information to the model to consider in its response. For example, a user would do this for a question about a recent news event that occurred after the model’s cut-off date.
There are various modes of operation. For example, ChatGPT’s Canvas mode splits the screen, putting the prompt/response on the left and the document being worked on on the right. This allows a more collaborative approach between user and AI tool. Within Canvas mode, ChatGPT models can write code, especially Python, and run it for us.
We can customise the app with system settings. We can explicitly provide instructions on how to respond (e.g. informally, briefly) to all prompts (see the sketch after this list). We can, if we wish, tell the app some details about ourselves that it will take into account in its responses.
ChatGPT also has a memory feature, where it infers and records details about the user based on information provided in prompts. Users can view the facts in the memory, delete them if necessary, and switch the memory off.
The user can also choose the modality, e.g. whether to speak or type prompts and whether to receive responses as text or audio.
AI tools remember our chat history. ChatGPT has a “Projects” feature where we can organise our chat history into projects, for example based on different topics.
The app gives us access to customised versions of the model, specialised to excel at a particular task, e.g. an image generator or a tutor.
We can attach a file, such as a PDF or spreadsheet, to our prompt.
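The custom-instructions idea above maps onto the system message in the underlying API. A minimal sketch using OpenAI's Python SDK (the model name and instruction text are just examples):

```python
from openai import OpenAI  # pip install openai; assumes OPENAI_API_KEY is set

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",  # an example model choice
    messages=[
        # The system message plays the role of the app's custom instructions.
        {"role": "system", "content": "Respond informally and briefly."},
        {"role": "user", "content": "What is a vector database?"},
    ],
)
print(response.choices[0].message.content)
```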
Models can be specialised to be:
A RAG model answers questions based on the content of a set of documents provided - and only that content, not its general training. This is a useful use-case. When building the model, the documents are split into chunks, typically a paragraph each. These chunks are put into a vector database. When preparing to respond to a prompt, the RAG model does a semantic similarity search between the prompt and the chunks, then adds the most relevant chunks (say, the top 5) into a prompt template that also tells the AI tool to use just this content.
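A minimal sketch of the retrieval step, assuming the sentence-transformers library and all-MiniLM-L6-v2 as an illustrative embedding model (in a real system the chunks would live in a vector database, and the assembled prompt would then be sent to an LLM):

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# Example document chunks (in practice, paragraphs from the supplied documents).
chunks = [
    "Our returns policy allows refunds within 30 days of purchase.",
    "Deliveries to the UK typically take 2-3 working days.",
    "Gift cards are valid for 12 months from the date of issue.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vectors = model.encode(chunks, normalize_embeddings=True)

question = "How long do I have to return an item?"
query_vector = model.encode([question], normalize_embeddings=True)[0]

# Semantic similarity search: cosine similarity reduces to a dot product
# because the vectors are normalised. Keep the top 2 chunks here (top 5 in the text).
scores = chunk_vectors @ query_vector
top = np.argsort(scores)[::-1][:2]

prompt = (
    "Answer the question using ONLY the context below.\n\n"
    "Context:\n" + "\n".join(chunks[i] for i in top) + f"\n\nQuestion: {question}"
)
print(prompt)  # this assembled prompt is what gets sent to the model
```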
Ollama classifies specialised models as embedding, vision or tools models.
There is a set of benchmarks for models. Different benchmarks measure different capabilities, e.g. problem-solving or maths. For example, DeepSeek published its results on the standard benchmarks here …
Benchmarks should be taken as a general guide - the really important thing is how well a model works for you.
Modality refers to the types of input and output, whether text, audio (speech or music), image or video.
Models can take in data and respond in several formats (text, audio, images, video). Current examples include:
The apps and websites of many AI tools offer multi-modal features. For example, GPT-4o transforms text to text, image to text (it can describe an image), text to image, and text to video. OpenAI demos GPT-4o as able to understand facial expressions and respond with a voice expressing emotion.
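A hedged sketch of a multi-modal request through OpenAI's Python SDK, passing an image alongside text (the image URL is a hypothetical placeholder):

```python
from openai import OpenAI  # assumes OPENAI_API_KEY is set

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            # Hypothetical placeholder URL - replace with a real image.
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```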
LLMs work with numbers. Our prompt needs to be converted from text to numbers to be input into the model, and the model’s output needs to be converted from numbers back to text. A tokenizer converts text to a sequence (array) of numbers: it splits the text into tokens (roughly words or word fragments) and maps each token to a number using a fixed dictionary. For example, the OpenAI GPT-4 dictionary has about 100,000 tokens.
This page https://platform.openai.com/tokenizer from OpenAI shows what a tokenizer does.
For example the sentence
How many times does the letter r appear in raspberry, elderberry and strawberry?
is split as shown below (notice that elderberry splits into two tokens)
and the corresponding token IDs are
[5299, 1991, 4238, 2226, 290, 10263, 428, 7680, 306, 134404, 11, 33757, 19772, 326, 101830, 30]
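We can reproduce this with OpenAI's tiktoken library (a sketch; the exact IDs depend on which model's encoding is selected):

```python
import tiktoken  # pip install tiktoken

# Look up the encoding used by a given model and tokenize the example sentence.
enc = tiktoken.encoding_for_model("gpt-4o")
ids = enc.encode("How many times does the letter r appear in raspberry, elderberry and strawberry?")
print(ids)                             # the token IDs
print([enc.decode([i]) for i in ids])  # the corresponding text fragments
```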
The Tiktokenizer page is an interactive page - enter a prompt and it will show the corresponding tokens.
Note that in the gpt-4o model there are special tokens for the start and end of the system and user prompts. These are used in supervised fine-tuning (SFT) to build the instruct model - an assistant that will respond usefully.
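As an illustration only (the exact special tokens for gpt-4o are not fully published), the ChatML format used by earlier OpenAI chat models shows the idea of wrapping each turn in start/end markers:

```python
# Illustrative only: ChatML-style rendering of a conversation into one string.
# Each <|im_start|> / <|im_end|> marker maps to a single special token ID.
def to_chatml(messages):
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ) + "<|im_start|>assistant\n"  # the model continues generating from here

print(to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How many r's in strawberry?"},
]))
```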