About Falcon Models

Falcon Models

Falcon is a family of large language models (LLMs) developed by the Technology Innovation Institute (TII) in Abu Dhabi. The models are trained on large text and code corpora, primarily TII's RefinedWeb web dataset, and can be used for a variety of tasks, including:

  • Natural language understanding (NLU)
  • Natural language generation (NLG)
  • Machine translation
  • Text summarization
  • Question answering
  • Code generation

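Falcon-7B and Falcon-40B are available under the Apache 2.0 license, which means that they can be freely used, modified, and redistributed. Falcon-180B is released under its own Falcon-180B TII License, which adds restrictions, so check the model card before using it commercially.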

Falcon-40B

Falcon-40B was the largest model in the Falcon family at its initial release. It has 40 billion parameters and was trained on roughly 1 trillion tokens. At release it reported state-of-the-art results among open models on a variety of NLP benchmarks.

Falcon-7B

Falcon-7B is a smaller version of Falcon-40B. It has 7 billion parameters and was trained on roughly 1.5 trillion tokens. It still achieves strong performance on NLP tasks, and it is much cheaper to fine-tune and deploy.

Falcon-180B

Falcon-180B is the newest and largest model in the Falcon family. It has 180 billion parameters and was trained on roughly 3.5 trillion tokens. At release it was the largest openly available LLM, and it reported state-of-the-art results among open models on a variety of NLP benchmarks.
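To put the three sizes in perspective, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. The sketch below is a back-of-the-envelope calculation, not a measurement: it uses the nominal parameter counts from the model names, and the `weight_memory_gb` helper is a hypothetical name introduced here.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are the nominal sizes taken from the model names.
PARAMS = {"Falcon-7B": 7e9, "Falcon-40B": 40e9, "Falcon-180B": 180e9}
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for name, n in PARAMS.items():
    row = ", ".join(
        f"{dtype}: {weight_memory_gb(n, dtype):.0f} GB" for dtype in BYTES_PER_PARAM
    )
    print(f"{name}: {row}")
```

Activations, the key/value cache used during generation, and optimizer state during training all come on top of these figures, so real requirements are higher.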

Use Cases

The Falcon models can be used for a variety of tasks, including:

  • Natural language understanding (NLU): Understanding the meaning of text, such as identifying the entities and relationships in a sentence.
  • Natural language generation (NLG): Generating text, such as poems, scripts, emails, and letters.
  • Machine translation: Translating text from one language to another.
  • Text summarization: Condensing a text document into a shorter, more concise version.
  • Question answering: Answering questions about a text document.
  • Code generation: Generating code, such as Python scripts or Java classes.
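Because the Falcon models are causal language models, each of these tasks is usually framed as a text prompt that the model continues. The sketch below shows one way to build such prompts. The template wording and the `build_prompt` helper are hypothetical and only illustrative; they are not an official Falcon prompt format.

```python
# Hypothetical prompt templates; the wording is illustrative only.
TEMPLATES = {
    "summarize": "Summarize the following text in one paragraph.\n\n"
                 "Text: {text}\n\nSummary:",
    "translate": "Translate the following text into {target}.\n\n"
                 "Text: {text}\n\nTranslation:",
    "answer": "Answer the question using the context.\n\n"
              "Context: {text}\n\nQuestion: {question}\n\nAnswer:",
}


def build_prompt(task: str, **fields: str) -> str:
    """Fill the template for `task`; raises KeyError on an unknown task or missing field."""
    return TEMPLATES[task].format(**fields)


prompt = build_prompt("translate", text="Good morning.", target="French")
print(prompt)
```

The resulting string is what would be tokenized and passed to the model for generation.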

Availability

The Falcon models are available through the Hugging Face Hub (for example, tiiuae/falcon-7b). They can also be loaded in EasyDeL, as shown below.

Conclusion

The Falcon models are a powerful family of LLMs that can be used for a variety of tasks. Their weights are openly available at no cost, making them a valuable resource for researchers and developers.

How to Use/Load Them in EasyDeL

import jax
from easydel import AutoEasyDeLModelForCausalLM

model, params = AutoEasyDeLModelForCausalLM.from_pretrained(
    'tiiuae/falcon-7b',
    # other kwargs
)

Also keep in mind that the returned model's config includes .get_partition_rules(fsdp=True).
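Partition rules of this kind are commonly expressed as an ordered list of (regex, sharding spec) pairs, where the first regex that matches a parameter's path decides how that parameter is sharded. The sketch below illustrates that lookup in plain Python. The rule patterns, the parameter paths, and the `match_spec` helper are hypothetical, and tuples of axis names stand in for `jax.sharding.PartitionSpec`; the rules EasyDeL actually returns may differ.

```python
import re

# Hypothetical rules: (regex over the parameter path, sharding spec).
# Tuples of axis names stand in for jax.sharding.PartitionSpec so that
# the sketch runs without JAX installed.
RULES = (
    ("word_embeddings/embedding", ("fsdp", None)),
    ("self_attention/query_key_value/kernel", ("fsdp", None)),
    ("mlp/dense_h_to_4h/kernel", ("fsdp", None)),
    ("mlp/dense_4h_to_h/kernel", (None, "fsdp")),
    (".*", (None,)),  # fallback: replicate everything else
)


def match_spec(param_path, rules=RULES):
    """Return the spec of the first rule whose regex is found in the path."""
    for pattern, spec in rules:
        if re.search(pattern, param_path):
            return spec
    raise ValueError(f"no partition rule matches {param_path!r}")


print(match_spec("transformer/h/0/mlp/dense_4h_to_h/kernel"))
print(match_spec("transformer/ln_f/scale"))
```

Order matters in this scheme: the catch-all `.*` rule must come last, otherwise it would shadow the more specific rules.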

Use With JaxServer

from easydel.serve import JAXServer, JAXServerConfig
from easydel import AutoEasyDeLModelForCausalLM
from transformers import AutoTokenizer

model, params = AutoEasyDeLModelForCausalLM.from_pretrained(
    'tiiuae/falcon-7b',
    # other kwargs
)


class FalconJaxServer(JAXServer):
    ...
    # Customize this class yourself; see the JAXServer
    # documentation to learn how.


server = FalconJaxServer.from_parameters(
    params=params,
    model=model,
    config_model=model.config,
    add_params_field=True,
    tokenizer=AutoTokenizer.from_pretrained('tiiuae/falcon-7b'),
    verbose=False,
    do_memory_log=True,
    server_config=JAXServerConfig()
)

server.fire()  # Launch FastAPI functions

shared_urls = server.launch(
    share_chat=True,
    share_inst=True
)

Done 😇 The same method works for all the Falcon models.