Technology Tales

Notes drawn from experiences in consumer and enterprise technology

AI & Data Science Jottings

23:04, 22nd November 2024

Introduction to Meta AI’s LLaMA

Meta AI's LLaMA models represent a significant advancement in open-source artificial intelligence, offering a range of foundation models that demonstrate competitive performance against leading closed-source systems. These models are trained on extensive publicly available data, enabling them to achieve strong results with relatively modest computational resources. They can work in multiple languages, though performance on non-English languages may be comparatively lower due to the predominance of English text in their training data.

While LLaMA models excel in general tasks and instruction-following, they face limitations in mathematical reasoning and domain-specific knowledge, which researchers are actively addressing through fine-tuning and other techniques. The original models were released primarily for research purposes under a non-commercial licence, with a focus on evaluating and mitigating risks such as biases, hallucinations and the generation of harmful content.

Subsequent iterations like LLaMA 2 and 3 have introduced improvements in context length and applicability, though challenges remain in ensuring robustness across diverse use cases. The release of these models has spurred innovation in the open-source community, fostering collaboration to enhance their reliability and expand their potential applications in fields such as data science, natural language processing and beyond.

15:43, 25th October 2024

Build and Deploy RAG-as-a-service

Unwind AI walks developers through building a production-ready retrieval-augmented generation service using Claude 3.5 Sonnet and Ragie.ai, achievable in fewer than 50 lines of Python code. Unlike conventional RAG applications, a managed service approach abstracts the more complex elements of data ingestion, chunking and vector retrieval through APIs, reducing infrastructure overhead and allowing developers to focus on building features.

Ragie.ai handles the full pipeline, from document chunking to hybrid keyword and semantic searches, and offers connectors for services such as Google Drive, Notion and Confluence. The tutorial guides readers through setting up a development environment, building a RAGPipeline class that manages authentication and API endpoints, and creating a Streamlit interface through which users can upload documents via URL, select a processing mode and submit queries that are answered using information retrieved from the uploaded material. The system retrieves relevant sections from a document and passes them to Claude 3.5 Sonnet, which synthesises a response, with the whole application launchable locally using a single terminal command.
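The retrieve-then-generate flow described above can be sketched without calling either service. This is a minimal illustration and not the tutorial's code: `retrieve` is a hypothetical stand-in for a managed retrieval call (here a naive keyword-overlap ranking), and `build_prompt` shows one way the retrieved sections might be packed into a prompt for a model such as Claude 3.5 Sonnet. All function names are assumptions made for this example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lower-case a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(chunks: list[str], question: str, top_k: int = 2) -> list[str]:
    """Hypothetical stand-in for a managed retrieval call.

    Ranks chunks by how many question words they share. A real service
    would combine keyword and semantic (vector) search instead.
    """
    query = tokenize(question)
    scored = [(len(query & tokenize(chunk)), chunk) for chunk in chunks]
    # sorted() is stable, so chunks with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [chunk for score, chunk in ranked[:top_k] if score > 0]


def build_prompt(context: list[str], question: str) -> str:
    """Pack retrieved chunks and the question into a single prompt string."""
    numbered = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(context, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}"
    )
```

In a full application, the string returned by `build_prompt` would be sent to the language model, and the model's reply shown to the user.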

15:41, 25th October 2024

IBM Granite 3.0: open, state-of-the-art enterprise models

The release of IBM's Granite 3.0 models introduces a range of advancements in artificial intelligence, focusing on efficiency, safety and scalability. These models include mixture of experts (MoE) variants, such as the 3B-A800M and 1B-A400M, which balance performance with low-latency inference, making them suitable for on-device and server applications.

A speculative decoding technique, applied to the 8B Instruct model, achieves a 220% increase in tokens per step, enhancing inference speed without compromising accuracy. The suite also features Granite Guardian models, designed to detect and mitigate risks such as hallucinations, bias and harmful content, outperforming existing solutions in benchmark tests.
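To make the "tokens per step" measure concrete, here is a toy sketch of greedy speculative decoding. It is not IBM's implementation, whose details are not described here; the `target` and `draft` callables are hypothetical stand-ins for a large model and a cheap speculator. Read literally, a 220% increase over a baseline of one token per step would mean roughly 3.2 tokens per step.

```python
from typing import Callable, Sequence

# A "model" here is any function mapping the tokens so far to the next token.
NextToken = Callable[[Sequence[str]], str]


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: list[str],
    max_new: int,
    k: int = 4,
) -> tuple[list[str], int]:
    """Return (new_tokens, target_steps) using greedy speculative decoding."""
    tokens = list(prompt)
    steps = 0
    produced = 0
    while produced < max_new:
        # 1. The cheap draft model proposes k tokens in sequence.
        proposal: list[str] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks the proposal. A real system does this
        #    in one batched forward pass, so it is counted as a single step.
        steps += 1
        accepted: list[str] = []
        for token in proposal:
            expected = target(tokens + accepted)
            if token != expected:
                # First mismatch: keep the target's token and stop.
                accepted.append(expected)
                break
            accepted.append(token)
        else:
            # Every drafted token matched, so the target adds a bonus token.
            accepted.append(target(tokens + accepted))

        accepted = accepted[: max_new - produced]
        tokens.extend(accepted)
        produced += len(accepted)
    return tokens[len(prompt):], steps
```

Because every accepted token is one the target model would have chosen itself, the output matches plain greedy decoding with the target alone; only the number of target steps changes.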

Available through platforms like Hugging Face, Ollama and IBM watsonx.ai, the models support diverse use cases, from agentic workflows to retrieval-augmented generation. Resources such as tutorials, quantisation guides and integration frameworks are provided to assist developers in deploying and optimising these tools for enterprise applications.

20:35, 2nd October 2024

Using Llama 3.2 Locally

Meta's Llama 3.2 models are available in two main variants: lightweight models suited to multilingual generation and tool calling, and vision models capable of image reasoning by processing images alongside prompts. Both can be run locally using Msty, a free desktop chatbot application that supports open-source models downloaded directly to a user's machine as well as remote models accessed via API keys.

To run the lightweight Llama 3.2 3B Instruct model locally, users download it from Hugging Face through Msty's model management interface in GGUF format, after which it can be used without an internet connection for tasks such as code generation and debugging. The vision variant, which currently lacks a GGUF release, is instead accessed remotely through the Groq API by creating a GroqCloud account, generating an API key and configuring it within Msty's remote provider settings, allowing users to submit images with prompts and receive detailed descriptive responses at considerable speed.

20:35, 2nd October 2024

5 LLM Tools I Can’t Live Without

Matthew Mayo, Managing Editor at KDnuggets, outlines five large language model tools he considers essential to his current workflow. LlamaIndex is a framework built for data-centric applications, particularly retrieval-augmented generation systems, offering integration with over 40 vector stores and data sources. Ollama enables users to run a variety of language models locally on their own hardware with minimal effort, and also supports serving models to external applications via a Python API. Ollama UI is a no-configuration chat interface, available as a Chrome extension, that provides a straightforward way to interact with locally hosted models. NotebookLM is a Google AI tool that lets users create notebooks drawing on uploaded reference material to produce summaries, FAQs, study guides and even podcast-style audio overviews. Finally, ControlFlow is a Python framework for building agentic AI workflows by defining discrete tasks, assigning specialised agents and combining them into structured flows, with a syntax designed for rapid prototyping.
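As a rough illustration of how an external application might talk to a locally served model, the sketch below builds a request for Ollama's local REST endpoint using only the standard library. It deliberately uses the HTTP interface, not the Python package mentioned above. It assumes an Ollama server on its default port, and the model tag `llama3.2` in the usage note is an assumption that depends on what has been pulled locally.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request asking a local model for one complete reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt)) as response:
        return json.loads(response.read())["response"]
```

With a server running, `generate("llama3.2", "Explain RAG in one sentence.")` would return the model's reply as a string.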

22:20, 26th September 2024

ScraperAPI is a web scraping platform designed to help businesses and developers collect data from public pages at scale, without the complexity of managing proxies, headless browsers or CAPTCHA handling. The service offers a plug-and-play API, asynchronous request handling and a no-code data pipeline tool, allowing teams to gather large volumes of information efficiently. It also provides structured data endpoints for popular platforms including Amazon, Google and Walmart, returning clean JSON or CSV output rather than raw HTML. A geotargeting feature grants access to a pool of over 40 million proxies across more than 50 countries, helping users avoid blocks. The platform is positioned as a cost-effective alternative to building in-house scraping infrastructure, with compliance with both CCPA and GDPR and a reported track record of serving over 11 billion requests within a 30-day period.

15:21, 14th March 2024

7 GPTs to Help Improve Your Data Science Workflow

The GPT Store, which features more than three million custom GPTs built by developers, offers several tools that can meaningfully enhance a data science workflow. The Data Analyst GPT, created by the ChatGPT team, can process uploaded data files and perform tasks such as correlation analysis while also generating reusable code. The Machine Learning GPT by Maryam Eskandari and the Machine Learning Engineer GPT by Hustle Playground both serve as assistants for building and comparing predictive models, with the latter placing particular emphasis on deployment and production structuring.

For coding support, AutoExpert by llmimagineers.com functions as a pair programming assistant with code generation capabilities and session state management. ScholarGPT by awesomegpts.ai helps practitioners locate relevant academic research papers based on a given use case or problem. Whimsical Diagrams by whimsical.com enables the creation of flowcharts, mind maps and sequence diagrams to visualise workflows and concepts, while the Canva GPT helps practitioners present their findings through professionally designed layouts and slide formats. Together, these tools cover a broad range of data science activities, from technical modelling and research to communication and visual presentation.

15:20, 14th March 2024

Data Science and the Go Programming Language

Go is a programming language developed by Google engineers Robert Griesemer, Rob Pike and Ken Thompson, introduced in 2009 as a systems programming language that retains the performance advantages of C while being simpler and safer to use. Unlike Python, which predates mainstream multicore processors, is interpreted and, in its reference implementation, is limited by a global interpreter lock to running one thread of Python code at a time, Go is designed for concurrent processing and compiles directly to machine code, typically making it faster and better suited to the computational demands of modern data science.

Northwestern University's Master of Science in Data Science programme has adopted a trilingual approach, using R as the primary language for analytics and modelling, Python for artificial intelligence and natural language processing, and Go for data engineering, recognising that each language has distinct strengths. Go is already widely used in industry by major organisations including Google, Netflix, Uber and Dropbox, and underpins many prominent computing tools and platforms. It is considered easier to maintain than Python, offers automated memory management, enforces a consistent coding style and has a strong commitment to backward compatibility, all of which make it well suited to building scalable, high-performance data science systems including web applications, database servers and distributed computing environments.

16:04, 22nd January 2024

Tugan.ai is an AI-powered content generation tool that allows users to produce original written content by simply submitting a URL, removing the need for manual prompts or copywriting expertise. The platform can transform existing online content, including videos and web pages, into a range of formats such as email sequences, social media posts, newsletters, LinkedIn articles and advertising copy. Unlike general-purpose AI tools, it is built on accumulated copywriting knowledge, intended to produce results that read as human and engaging rather than mechanical.

22:33, 8th January 2024

Microsoft Clarity is an analytics tool that uses artificial intelligence to provide insights into user interactions with websites and applications. It offers features such as session recordings to observe user behaviour, heatmaps to visualise engagement patterns and AI-generated summaries to highlight key trends and areas for improvement. The tool also includes AI chat functionality for querying data and brand agents to assist users directly on sites.

Designed for businesses of all sizes, it supports mobile app analytics through lightweight integrations and works across multiple platforms, including Android, iOS and web-based applications. Users have reported improved decision-making through data-driven observations, with features helping to identify usability issues, track conversion rates and refine strategies based on real-time user feedback. The service is available free of charge, with no restrictions on traffic or usage, and is compatible with privacy regulations such as GDPR and CCPA.
