Statistics

Hugging Chat Statistics

Dive into the world of HuggingChat, the open-source AI chatbot reshaping conversational AI. Explore its features, architecture, and position in the AI landscape.

Welcome to an exploration of HuggingChat, an open-source AI chatbot reshaping the landscape of conversational AI. We embark on this journey to uncover the intricacies of HuggingChat, offering a compelling alternative to ChatGPT.

As we begin our journey, we examine HuggingChat's top highlights, including its outstanding 30 billion parameters, the extensive dataset collected as of April 2023, and the pivotal distinction of not storing chat data. This article provides a detailed overview of HuggingChat's capabilities, and then a closer look at its architecture and special features.

Hugging Face has emerged as the visionary force behind HuggingChat, with a mission to democratize AI. The ethos of this innovative project is based on their dedication to transparency and accessibility, which gives further context to HuggingChat's story of emergence in the ever-changing AI field.

Our investigation goes a step further and pits HuggingChat against its artificial intelligence competitors. Our objective is to present a comprehensive analysis of HuggingChat's capabilities and efficacy by contrasting and comparing it with other top chatbots.

1. Top Highlights of HuggingChat

  1. HuggingChat is the open-source alternative to ChatGPT (Source).
  2. HuggingChat has 30 billion parameters and is at the moment, the best open-source chat model (Source).
  3. HuggingChat was released by Hugging Face, an artificial intelligence company founded in 2016 with the self-proclaimed goal of democratizing AI (Source).
  4. The Hugging Face dataset for HuggingChat contains data collected up to April 12, 2023 (Source).
  5. The HuggingChat dataset is the result of a worldwide crowdsourcing effort by over 13,000 volunteers and includes 161,443 messages distributed across 66,497 conversation trees in 35 different languages, annotated with 461,292 quality ratings (Source).
  6. HuggingChat is a generative AI tool that can create texts like; summaries, essays, letters, emails, and song lyrics (Source).
  7. Accessing HuggingChat is quick and straightforward – just visit HuggingFace.co/Chat and you’re ready to chat (Source).
  8. HuggingChat does not store any chat data (Source).
  9. HuggingChat is prone to hallucinations, a phenomenon that happens when an AI chatbot responds with information that isn't based on reality (Source).

2. Overview of HuggingChat

What is HuggingChat?

  • HuggingChat is a new AI-powered chatbot available for testing on Hugging Face (Source).
  • HuggingChat can perform similar tasks as ChatGPT, including drafting articles, solving coding problems, or answering questions (Source).
  • HuggingChat is the open-source alternative to ChatGPT (Source).

How much Data does HuggingChat have?

HuggingChat has 30 billion parameters and is at the moment the best open-source chat model according to Hugging Face (Source).

HuggingChat has 30 billion parameters with the best open-source chat model

What Model is HuggingChat based on?

  • HuggingChat is currently based on the latest Large Language Model Meta AI (LLaMA) model, a foundational Conversational AI model with 65 billion parameters from Meta, released in late February 2023, developed by the project OpenAssistant (Source).
  • OpenAssistant has the rather ambitious goal of going beyond ChatGPT, they intend to build the assistant of the future, able to not only write email and cover letters, but do meaningful work, use APIs, dynamically research information, and much more, with the ability to be personalized and extended by anyone (Source).
  • Open Assistant itself is a project of the non-profit Large-scale Artificial Intelligence Open Network (LAION); a global non-profit organization behind Stable Diffusion (Source), and dedicated to providing access to cutting edge technology as open source (Source).

Is Data on HuggingChat updated to 2023?

  • HuggingChat was trained with the OpenAssistant Conversations Dataset (OASST1), containing data that was collected up to April 12, 2023. This model uses the same training methodology created by OpenAI that’s called reinforcement learning from human feedback (RLHF) (Source).
  • HuggingChat is an AI chatbot that was trained using a brand-new set of Open Assistant Conversations (OASST1), consisting of 161,443 messages distributed across 66,497 conversation trees, annotated with 461,292 quality ratings (Source), collected until April 12, 2023 (Source).
Hugging Chat Data
  • The HuggingChat dataset is the product of a worldwide crowdsourcing effort by over 13,000 volunteers and covers 96 different languages (Source).

  • According to the Chatbot Arena Conversations Dataset, this dataset includes 33,000 conversations involving 20 different models, with an average of 1.2 turns per sample and an average of 52.3 tokens per prompt.

  • Each sample includes a question ID, two model names, the full conversation text in OpenAI API JSON format, the user vote, an anonymized user ID, the detected language tag, the OpenAI moderation API tag, an additional toxic tag, and a timestamp (Source).

  • The dataset reflects real-world user prompts and includes measures to flag and exclude personally identifiable information (PII) and inappropriate content for a safe data release. (Source).

Do you need to create an Account to use HuggingChat?

  • HuggingChat is available to everyone, and you don't need to register or create a login account to use it (Source).

  • Accessing HuggingChat is quick and straightforward – just visit HuggingFace.co/Chat and you’re ready to chat (Source).

Who created HuggingChat?

  • HuggingChat was released by Hugging Face, an artificial intelligence company founded in 2016 with the self-proclaimed goal of democratizing AI (Source).

  • Hugging Face, the AI startup backed by tens of millions in venture capital, released an open-source alternative to OpenAI’s viral AI-powered chatbot, ChatGPT, dubbed HuggingChat (Source).

  • Though Hugging Face is likely to use different models in HuggingChat in the future, Open Assistant is always looking for volunteers to help train its AI chatbot, for which anyone can sign up for (Source).
  • HuggingFace launched a ChatGPT “clone” called HuggingChat based on Open Assistant, a 30-billion-parameter LLaMa model (Source).

Is the Source Code for HuggingChat Accessible?

  • HuggingChat is an open-source technology, which allows users to access and modify the source code to improve the platform and add new properties (Source).

What can HuggingChat be used for?

  • Just like other AI chatbots, HuggingChat can be used to request information, complete tasks, and for entertainment (Source).
  • For instance, HuggingChat can generate text like summaries, essays, letters, emails, and song lyrics. It can also debug and write code, create Excel formulas, and answer general questions much like ChatGPT (Source).
  • Developers who wish to integrate HuggingChat into their existing software can seamlessly do so using Hugging Face's API (Source).
Hugging Chat Uses

What’s the Quality of HuggingChat’s responses?

  • HuggingChat may become slow to respond or not respond at all at times, depending on how busy the servers are (Source).
  • HuggingChat is constantly improving and getting updated, as it's still very much in training, so higher-quality responses and faster load times are to be expected in the future (Source).
  • As with all generative AI tools, the quality of the response you get will largely depend on how well-written your prompts are (Source).
  • HuggingChat can help users generate text in different styles and formats, perform translations, answer questions, and become a useful productivity tool for various tasks such as coding and writing (Source).

Can HuggingChat be Detected?

  • Originality.AI can detect HuggingChat, 9/10 times. Other AI detectors struggle achieving only at best a 20% detection rate (Source).
  • HuggingChat is expected to improve with new NLP models and AI detectors are expected to improve at detecting Open Assistant powered HuggingChat (Source).
  • Results from a test on the detection of tasks done by HuggingChat, are shown below.

➤ Green = Correctly Identified the Content as AI or Original.

➤ Yellow = Uncertain.

➤ Red = Incorrectly Identified the Content as AI or Original (Source).

Hugging Chat vs Ai Detectors

Does HuggingChat store Chat History?

  • HuggingChat does not store any chat data, use any data for training and there are no user accounts (Source).

How is HuggingChat trained?

  • HuggingChat was trained using a method called reinforcement learning from human feedback (RLHF), involving a lot of questions and answers from over 13,000 volunteers from all over the world, helping teach the AI to understand and follow instructions better (Source).
  • This was a good way to get data in different languages. However, it also means that the dataset might have some biases. Not everyone's opinions are represented equally. For example, they sent out a survey to their Discord channel (in English only) asking their open-source contributors questions related to their demographics (but not ethnicity) (Source).
  • The results of the survey revealed that out of the 226 respondents, 201 were male, 10 were female, five identified as non-binary/other and 10 declined to answer (Source).
Hugging Chat Survey Results By Gender

Is Open-Source Model Accepted?

  • Some researchers have criticized the release of open-source models along the lines of StableLM in the past, arguing that they’re flawed and could be used for malicious purposes like creating phishing emails. But others point out that gatekept commercial models like ChatGPT, many of which have filters and moderation systems in place, have been shown to be imperfect and exploitable, as well (Source).

3. HuggingChat versus Other AI Chatbots

How is HuggingChat similar to Other well-known AI Chatbots?

  • Like ChatGPT, Bing Chat, Bard, and other AI chatbots, HuggingChat is a generative AI tool that can create text like summaries, essays, letters, emails, and song lyrics. It can also debug and write code, create Excel formulas, and answer general questions much the way that ChatGPT can (Source).

Which is better, ChatGPT or HuggingChat?

  • HuggingChat can handle many of the tasks ChatGPT can, like writing code, drafting emails, and composing lyrics (Source).
  • ChatGPT can't give you information that's happened after September 2021, so it can't give you real-time updates. HuggingChat on the other hand has information updated up to April 2023 (Source).
  • Right now, ChatGPT is considered the better choice compared to HuggingChat. ChatGPT has been around for a longer time, which means it has had more time to learn and improve. It also has a wider range of features and abilities that make it very useful (Source). 
  • The HuggingChat dataset is not as large as ChatGPT's dataset (Source).This size difference could potentially make HuggingChat more prone to "hallucinations" (generating information that wasn't in the training data) and providing inaccurate information compared to ChatGPT (Source).
  • When thinking about options, the cost is important. ChatGPT has a free version and a paid one that provides better answers for $20 per month. On the flip side, HuggingChat is completely free for everyone (Source).
  • In terms of writing style, ChatGPT provides organized and clear answers, avoiding taking sides. On the other hand, HuggingChat offers more personalized responses, speaking in a more human-like way. However, there might be times when HuggingChat doesn't grasp the context as effectively (Source).
  • Here are some examples of situations where HuggingChat might struggle with context (Source):
  • HuggingChat may struggle with multi-turn conversations, especially on complex topics, sometimes forgetting details from previous turns.
  • HuggingChat may struggle with language nuances; users noted issues with Spanish formality and German grammar.
  • HuggingChat may struggle with ambiguous prompts; for instance, a user received a complex response when asking a simple question like "How are you?"
  • HuggingChat may produce inaccurate or irrelevant information, as seen when a user asked about the last Formula One championship in South Africa, and the model provided incorrect dates.
  • HuggingChat struggles with generating lengthy code or text, especially when users request the model to write entire games or provide detailed responses.
  • HuggingChat may produce offensive or inappropriate responses, such as fabricating serious and graphic claims about events on specific dates.
  • HuggingChat can’t be expected to have ChatGPT’s level of output, because the service is not at that level yet. The app page lists it as version 0.0, which gives an idea of how young it is at this point (Source).
  • Similar to GPT-3 vs GPT-4, the HuggingChat model struggles with simple reasoning questions that GPT-4 is now capable of answering (Source). Some examples are highlighted below:
  • In discussing how an independent voice assistant could compete with popular alternatives, HuggingChat suggested one strategy, while GPT-4 proposed multiple strategies. This hints at HuggingChat's potential limitation in understanding the complexity of the question (Source).
  • In another instance, when questioned about why Siri cannot be used with Amazon Echo, HuggingChat's response lacked detail compared to GPT-4 and Google Bard. This suggests a potential limitation in HuggingChat's understanding of the full context or specific details of such questions (Source).
  • With HuggingChat being open source, the number of ways it is going to be expanded on is potentially more varied than the controlled variations of ChatGPT (Source).

What’s the difference between HuggingChat and ChatGPT?

  • HuggingChat can be said to be an open-source version of ChatGPT. It's also more unreliable than its better-known rival, at least for the time being (Source).
  • ChatGPT is built on OpenAI's proprietary GPT-3.5 architecture (Source). In contrast, HuggingChat is built on the LLaMA model, developed by the Open Assistant project. The Open Assistant project is a nonprofit organization that created the dataset used to train Stable Diffusion, a text-to-image AI model (Source).
  • HuggingChat is free to use for everyone, while ChatGPT offers a paid subscription plan of $20 monthly, to access its GPT-4 version (Source).
  • HuggingChat has 30 billion parameters and is at the moment the best open-source chat model according to Hugging Face. ChatGPT on the other hand, has 100 billion parameters (Source).
Comparison Between Hugging Chat and Chatgpt-4 Parameters

Does HuggingChat and other AI Chatbots Hallucinate?

  • Hallucinations are possible if the bot has issues distinguishing real data from fake data, or if it's trained on a dataset that has errors (Source).
  • ChatGPT's training has turned it into a somewhat reliable AI chatbot -- with some limitations. Artificial intelligence is imperfect, so AI chatbots run the risk of hallucinations and misinformation (Source).
  • HuggingChat is exceptionally prone to hallucinations, a phenomenon that happens when an AI chatbot responds with information that isn't based on reality (Source).
  • Google Bard is notorious for hallucinating and giving wrong information, and, though it happens to all large language models, ChatGPT is less known for hallucinating (Source).
  • HuggingChat is more prone to both hallucinations and providing inaccurate information than the popular ChatGPT, as its dataset isn't nearly as large as its competitor's. The AI chatbot is aware of these limitations, however, and will often remind the user of them (Source).

Is HuggingChat similar to ChatGPT?

In several aspects, HuggingChat bears a strong resemblance to ChatGPT and can perform with comparable effectiveness and precision. For instance, the user interface of Chat GPT closely mirrors that of HuggingChat (Source).

Jonathan Gillham

Founder / CEO of Originality.ai I have been involved in the SEO and Content Marketing world for over a decade. My career started with a portfolio of content sites, recently I sold 2 content marketing agencies and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences I understand what web publishers need when it comes to verifying content is original. I am not For or Against AI content, I think it has a place in everyones content strategy. However, I believe you as the publisher should be the one making the decision on when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!

More From The Blog

AI Content Detector & Plagiarism Checker for Marketers and Writers

Use our leading tools to ensure you can hit publish with integrity!