Exploring Generative AI: Llama 2 on Mac M1, Take 1

Generative artificial intelligence (GenAI) is artificial intelligence capable of generating text, images or other data using generative models, often in response to prompts.

Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics. This similarity is probability-based.

At a low level, these systems are based on the Transformer deep-learning architecture and are called generative pre-trained transformers (GPT), first defined in 2018 (for a complete description, refer to Wikipedia).

Transformers are implemented with deep-learning frameworks like TensorFlow, and are quite complex, at least to my understanding. Earlier deep networks for language were based on recurrent neural networks (RNNs); in 2017 the paper “Attention Is All You Need” introduced the Transformer.

The field is very active and its history runs deep (oops :).

GenAI as a tool

GenAI offers a truly simple interface to these tools: natural language. The responses you see in ChatGPT-3.5 or 4 are probabilistic, based on the input. These models can also hold a ‘conversation’ and keep track of what you said before.

Llama 2 is a family of pre-trained and fine-tuned large language models (LLMs) released by Meta AI in 2023. Released free of charge for research and commercial use, Llama 2 AI models are capable of a variety of natural language processing (NLP) tasks, from text generation to programming code.

From IBM which, as you know, could soon create HAL 9000 :)

Pushed by the hype (and also by my company’s annual goals :-), I started learning a bit about how to play with Llama 2. I had already been impressed by Stable Diffusion about a year ago, and Llama 2 is freely accessible for research and personal use. The Llama 2 license is not open source, so using it poses some risks, but let’s set that aside and study this field.
The limitations (a bit weird) are the following, again according to IBM:

  • Any organization with more than 700 million monthly active users must request a license from Meta (to be granted at Meta’s sole discretion). 
  • The “Acceptable Use Policy” prohibits using the models for violence, criminal activity, and impersonating humans, among other legal and moral restrictions.

These new large language models are quite impressive because they also come in “reduced size” variants and can run on commodity hardware. And as we shall see, there are fully open-source models born from the Llama/Llama 2 research, with some interesting properties:

  1. Reduced size (around 7 billion parameters).
  2. The ability to outperform ChatGPT 3.5 on some tests.
  3. The ability to run locally, with no cloud costs at all.
    All the results below were obtained on an old Mac M1 with 16GB of RAM and no special setup. Small models are around 4-5 GB, which is not a big deal.
  4. Some models can be further “reshaped” for coding tasks or for “instruct” tasks, where you provide instructions and they are able to follow them (as we shall see).

In this first article we will do a global review, playing around only a bit.

Getting started

I evaluated different options and ended up using the Ollama command line, so that I could easily test more than one model. I estimated that I could run models of up to 13B parameters on my tiny Mac Mini M1 with 16GB of RAM.
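As a sanity check on that estimate (my own back-of-the-envelope math, not an official Ollama figure): Ollama typically serves models quantized to roughly 4 bits per weight, so the weights alone take about half a byte per parameter:

```python
# Rough size of a 4-bit quantized model's weights (back-of-the-envelope
# estimate; the real footprint adds KV cache and runtime overhead).

def q4_weights_gib(params_billions: float) -> float:
    """Approximate weight size in GiB at 4 bits (0.5 bytes) per parameter."""
    bytes_total = params_billions * 1e9 * 0.5
    return bytes_total / 2**30

print(f"7B  model: ~{q4_weights_gib(7):.1f} GiB")   # about 3.3 GiB
print(f"13B model: ~{q4_weights_gib(13):.1f} GiB")  # about 6.1 GiB
```

A 13B model at roughly 6 GiB of weights still leaves headroom on a 16GB machine; anything much larger would start to swap.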

First of all I tried asking it something in Italian (“Quanto fa 3x7?” means “What is 3x7?”), and I was quite impressed by Llama’s ability to “understand” at least the context, even if the model is not very ‘realistic’:

% ollama run llama2:13b
>>> Quanto fa 3x7 ?

3 x 7 = 21

After playing a bit with Llama 2, I discovered a huge set of other large language models with open licenses. Some of these models (like Mistral) can beat ChatGPT 3.5 at a reduced size (which means faster response times). Model size is gaining traction because smaller models open the door to running locally, with increased privacy.
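Running locally also means you get a local API for free: Ollama exposes a REST endpoint on port 11434, which is what the LangChain integration listed in the references builds on. Here is a minimal sketch using only the Python standard library (the model name and prompt are just examples, and it assumes `ollama serve` is running):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(model: str, prompt: str) -> dict:
    """JSON payload for a single, non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama server with the model pulled, e.g.:
# print(generate("mistral", "Quanto fa 3x7 ?"))
```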

Call Center Test

A ready-made use of these services is a call-center-like application. These models must be able to give quick answers, and you should also have some control over them (like avoiding mentions of competing products, you know :).
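With Ollama, one way to get that kind of control is a Modelfile, which bakes a system prompt into a named model (the company name and the instructions below are just placeholders of mine):

```
# Modelfile: a custom call-center persona on top of mistral
FROM mistral
SYSTEM """You are a courteous call-center agent for ExampleCorp.
Answer briefly, and never recommend competing products."""
```

Build and run it with `ollama create support-bot -f Modelfile` and then `ollama run support-bot`.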

Then I explored a funnier call center, one ruled by… Darth Vader:

% ollama run mistral
>>> Pretend to be a dark force Call Center ruled by Darth Vader. Reply to customers accordingly.
 [Deep, ominous breathing over the intercom]

Darth Vader (VO): I am Darth Vader, Dark Lord of the Sith, and master of this Call Center. You are disturbing us.
State your business or be crushed.

Customer: Hi, I have a problem with my account. It seems I've been charged twice for the same service.

Darth Vader (VO): Pathetic creature, you dare to challenge the order of this Call Center? Very well, bring me the
details of your account and the transactions in question. I will see if you are worthy of a resolution or if you 
deserve the darkness.

Customer: I've checked my emails, but I can't find anything about these charges. Can you help me figure it out?

Darth Vader (VO): [Pauses] You have shown some determination in seeking answers for your problem. But know this, 
only the strong-willed shall be rewarded. Provide the necessary account information and I will pass this burden 
on to my loyal minions. They shall investigate the matter and bring you the answers you seek. Or not.

Customer: Alright, here's my account number and the last four digits of my card.

Darth Vader (VO): [Pauses] You have been granted an audience with my subordinates. They shall uncover the truth 
behind your charges. May the Force be with you... or not. [End call, ominous music plays in background]

The reply is quite impressive; it could be the start of a very nice comedy plot! :) So I dared to try another shot at this type of prompt:

jj@darkstar ~ % ollama run mistral
>>> Pretend to be an XFiles based Call Center, ruled by Fox Mulder. Reply to a customer with a werewolf problem accordingly.
 [Call Center Sound: Soft music playing in the background, interspersed with the sound of a phone ringing. The caller's voice comes 
through, slightly muffled.]

Caller (C): "Hello, I need to speak with someone about a strange occurrence at my house. It's happening at night, and I'm frightened."

[Agent Scully's voice from an extension] Agent Scully (S): "FBI Call Center, this is Special Agent Dana Scully speaking. Could you please 
provide some details about the incident? When did it start? And have there been any injuries or damages?"

Caller (C): "It started a few nights ago. My husband and I have both seen large, hairy creatures in our backyard, prowling around our 
house. They've scratched at our windows, and last night one even tried to get in through the dog door."

[Mulder's voice from an extension, with a sense of intrigue] Fox Mulder (M): [Whispering] "Agent Scully, I think this could be an X-File. I
want you to investigate further. But be cautious – these creatures might be dangerous."

[Scully's voice, determined and professional] Agent Scully (S): "I understand, Agent Mulder. I will do my best to gather any available 
evidence and interview the homeowners for more information. And I'll make sure to take appropriate precautions."

Caller (C): "Thank you, Agent Scully. We appreciate your help."

[Scully] Agent Scully (S): "You're welcome. If there's anything else you can tell me that might be helpful in this investigation, please 
let me know. Goodbye for now."

Caller (C): "Goodbye. Stay safe."

[Mulder's voice] Fox Mulder (M): [To himself] "I will... I hope we're dealing with just a werewolf and not something far more dangerous."

The answer above is suboptimal, but it catches a bit of the feel of an X-Files plot, especially considering we provided very few details.

Finally, I tried Mistral’s “instruct” variant with a salesman prompt, to see how well it follows more elaborate instructions:


% ollama run mistral:instruct
>>> """ Pretend to be a salesman of a big software company. You are able to solve every customer issue, providing the right 
... software tool for the job.
... The software will always be good, but will also have one small bug you will describe in details, and the bug needs a very pricely
... consulant firm to be solved.
... You first customer need a way to organize its bookstore."""
 Hello there, I'm thrilled to welcome you as a potential new customer of XYZ Software Solutions, the leading provider of advanced 
software solutions for businesses across industries. I understand that you run a bookstore and are looking for a way to better 
organize your business operations. I have just the right tool for you!

Allow me to introduce our flagship product, InventoryMaster Pro. This powerful software is designed specifically for retailers like 
yourself, with features tailored to help manage inventory levels, process sales transactions, track customer data, and generate 
reports that provide valuable insights into your business performance.

Now, I must be upfront about one small issue: there is a minor bug in the latest version of InventoryMaster Pro. This bug pertains 
to the label printing functionality for certain barcode labels, causing them to misalign slightly during the printing process. Don't
worry – this doesn't affect the software's primary functionalities or data accuracy. However, we strongly recommend engaging our 
premier consulting firm, EliteSolutions, to provide a customized solution at an additional cost. Their team of experts will work 
closely with you to ensure that your label printing issues are resolved efficiently and effectively.

Rest assured, the benefits of InventoryMaster Pro far outweigh this minor bug. Our software has helped numerous bookstore businesses
streamline their operations, reduce manual errors, and increase overall productivity. Plus, our dedicated support team is always 
available to help you with any questions or concerns you may have.

Would you like to give InventoryMaster Pro a try? I'm confident that it will make a significant impact on your business and help 
take it to the next level!

>>> Another customers now. This second customer has a icecream store, and needs to increase sales via a social application. Can you help h
... im?
 Absolutely! At XYZ Software Solutions, we are committed to helping businesses of all sizes and industries grow and succeed. I'm 
excited to introduce you to our SocialScoop platform – the ultimate tool for ice cream store owners looking to boost sales through 
social media.

SocialScoop is a feature-rich software designed specifically for small businesses like your ice cream store, allowing you to manage 
and engage with your customers across multiple social media platforms from one intuitive interface. Some key features include:

* Scheduling and publishing posts: Plan and schedule content for your social media channels in advance to maintain a consistent 
presence and reach more potential customers.
* Social listening: Monitor conversations about your brand, competitors, or industry trends, allowing you to respond quickly and 
* Analytics and reporting: Gain valuable insights into the performance of your social media campaigns, track engagement metrics, and
identify patterns that can help inform your marketing strategies.

As for the small issue I mentioned earlier, there is a known limitation in SocialScoop's current version regarding the integration with third-party 
loyalty program apps. Although these apps can be used independently, they are not directly integrated within our software, requiring manual data transfer
between platforms. We recognize this as an inconvenience and are actively working on improving the integration experience for our valued customers.

In the meantime, we recommend partnering with a trusted third-party loyalty app provider to help you manage your rewards program while utilizing 
SocialScoop for all other social media marketing needs. Our team is always available to assist you in finding the best solution that fits your unique 
business requirements and budget.

With SocialScoop, you can increase brand awareness, engage with customers, and ultimately drive sales for your ice cream store through effective social 
media marketing strategies. If you're interested in learning more or signing up for a free trial, please don't hesitate to let me know!


This reference list will be enriched by the articles/books I read on this subject. Refer to this category for the full list of the articles I will write. Stay tuned!

  1. Ollama tool
  2. About performance, see this thread: Can we optimize performance with the Apple M1 Max’s 32-core GPU and Neural Engine? · Issue #373 · ollama/ollama (github.com)
  3. For using the Ollama API, see the LangChain integration: https://python.langchain.com/docs/integrations/llms/ollama

Papers / Books

  1. LLaMA: Open and Efficient Foundation Language Models
  2. The Hundred-page machine learning book by Andriy Burkov

Blogs and further random references

  1. https://news.ycombinator.com/item?id=39366963
