{"id":1756,"date":"2023-07-20T05:49:33","date_gmt":"2023-07-20T05:49:33","guid":{"rendered":"https:\/\/www.codecrafttech.com\/resources\/?p=1756"},"modified":"2023-07-24T06:04:20","modified_gmt":"2023-07-24T06:04:20","slug":"bloom-ai-model-the-stepping-stone-for-next-level-intelligence","status":"publish","type":"post","link":"https:\/\/www.codecrafttech.com\/resources\/highlights\/bloom-ai-model-the-stepping-stone-for-next-level-intelligence.html","title":{"rendered":"BLOOM AI Model \u2014 The Stepping Stone For Next-Level Intelligence"},"content":{"rendered":"<div class=\"qk xr ze jv zf\">\n<div class=\"ab cm\">\n<div class=\"ed bg ee ef eg eh\">\n<p data-selectable-paragraph=\"\"><img decoding=\"async\" class=\"alignnone size-full wp-image-1757 lazyload\" data-src=\"https:\/\/www.codecrafttech.com\/resources\/wp-content\/uploads\/2023\/07\/1_JMfyMVGpOIAaEXdv9LxKnQ.webp\" alt=\"\" width=\"1920\" height=\"1080\" data-srcset=\"https:\/\/www.codecrafttech.com\/resources\/wp-content\/uploads\/2023\/07\/1_JMfyMVGpOIAaEXdv9LxKnQ.webp 1920w, https:\/\/www.codecrafttech.com\/resources\/wp-content\/uploads\/2023\/07\/1_JMfyMVGpOIAaEXdv9LxKnQ-768x432.webp 768w, https:\/\/www.codecrafttech.com\/resources\/wp-content\/uploads\/2023\/07\/1_JMfyMVGpOIAaEXdv9LxKnQ-1536x864.webp 1536w\" data-sizes=\"(max-width: 1920px) 100vw, 1920px\" src=\"data:image\/svg+xml;base64,PHN2ZyB3aWR0aD0iMSIgaGVpZ2h0PSIxIiB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciPjwvc3ZnPg==\" style=\"--smush-placeholder-width: 1920px; --smush-placeholder-aspect-ratio: 1920\/1080;\" \/><\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"qk xr ze jv zf\">\n<div class=\"ab cm\">\n<div class=\"ed bg ee ef eg eh\">\n<div class=\"qk xr ze jv zf\">\n<div class=\"ab cm\">\n<div class=\"ed bg ee ef eg eh\">\n<p id=\"bae1\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The emergence of artificial 
intelligence has created a breakthrough in the world. The BLOOM model is a large language model at the forefront of this technology, with advanced capabilities in natural language understanding, text generation, and problem-solving.<\/p>\n<p id=\"1bb4\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The BLOOM model, short for \u201cBigScience Large Open-science Open-access Multilingual Language Model,\u201d is a large language model,\u00a0<a class=\"af jt\" href=\"https:\/\/medium.com\/@kiran.phd.0102\/generative-ai-is-it-mere-hype-or-a-portal-to-a-new-future-39701b24f5ae\" rel=\"noopener\">breaking the frontiers in generative AI<\/a>, that was built in the open by an international research collaboration.<\/p>\n<p id=\"7986\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Developed by more than 1000 AI researchers, BLOOM was one of the largest open-access language models at the time of its release. 
It creates an opportunity for small businesses, start-ups, and individuals to leverage the potential of the AI model to create innovative applications.<\/p>\n<blockquote class=\"afj afk afl\">\n<p id=\"f476\" class=\"aep aeq afm vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi qk bj\" data-selectable-paragraph=\"\"><strong class=\"vs im\"><em class=\"zh\">Without further ado, let\u2019s delve deep into the BLOOM AI model and see how it is a stepping stone for the next level of intelligence!<\/em><\/strong><\/p>\n<\/blockquote>\n<h2 id=\"65ac\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\"><em class=\"age\">Everything you should know about BLOOM AI<\/em><\/strong><\/h2>\n<p id=\"ebab\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\">BLOOM is an open-access multilingual language model with 176 billion parameters, trained on roughly 366 billion tokens. Hugging Face\u2019s BigScience team, the Microsoft DeepSpeed team, the NVIDIA Megatron-LM team, the IDRIS\/GENCI team, the PyTorch team, and BigScience\u2019s engineering team all contributed to building it.<\/p>\n<p id=\"603d\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The project was founded by Hugging Face and the French NLP community and soon went on to attract participants from more than 70 countries and experts from more than 250 institutions. Two French agencies, CNRS and GENCI, provided a computing grant worth an estimated \u20ac3 million for the research and training of the BLOOM Model. 
The BLOOM Model was trained on the Jean Zay supercomputer at IDRIS\/CNRS, south of Paris, for 117 days (11 March to 6 July 2022).<\/p>\n<p id=\"af56\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">It is built on the Transformer architecture, which comprises an input-embedding layer, 70 transformer blocks, and an output language-modeling layer. The architecture of the BLOOM model is similar to that of GPT-3; however, BLOOM is trained on 46 natural languages and 13 programming languages.<\/p>\n<figure class=\"agl agm agn ago agp adq ux uy paragraph-image\">\n<div class=\"aef aeg dj aeh bg aei\" tabindex=\"0\" role=\"button\">\n<div class=\"ux uy agk\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*AHfYY-Nk9U2_bYp0 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*AHfYY-Nk9U2_bYp0 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*AHfYY-Nk9U2_bYp0 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*AHfYY-Nk9U2_bYp0 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*AHfYY-Nk9U2_bYp0 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*AHfYY-Nk9U2_bYp0 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*AHfYY-Nk9U2_bYp0 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" \/><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*AHfYY-Nk9U2_bYp0 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*AHfYY-Nk9U2_bYp0 
720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*AHfYY-Nk9U2_bYp0 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*AHfYY-Nk9U2_bYp0 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*AHfYY-Nk9U2_bYp0 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*AHfYY-Nk9U2_bYp0 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*AHfYY-Nk9U2_bYp0 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\" \/><img decoding=\"async\" class=\"bg aej aek c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*AHfYY-Nk9U2_bYp0\" alt=\"\" width=\"700\" height=\"394\" \/><\/picture><\/div>\n<\/div>\n<\/figure>\n<h2 id=\"fd9b\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\">What languages is BLOOM AI trained on?<\/strong><\/h2>\n<p id=\"026a\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\">BLOOM is based on the causal language model. It is trained as a next-token predictor and predicts the succeeding token in a sentence based on the preceding tokens. This attribute enables BLOOM to connect different concepts in a sentence and accurately solve arithmetic, translational, and programming problems. 
BLOOM\u2019s architecture comprises 70 transformer blocks, each consisting of a self-attention layer and a multi-layer perceptron, with input and post-attention layer norms.<\/p>\n<p id=\"2de1\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Graph-pattern search, full-text search, edit graph data, slicer, and advanced phrases query searches are a few of the capabilities that BLOOM possesses. One of the major advantages of BLOOM is that a machine with 16 GB of RAM is reportedly sufficient to run the model without the necessity of a GPU.<\/p>\n<h2 id=\"5af8\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\">What are the differentiators between BLOOM AI and ChatGPT?<\/h2>\n<p id=\"0fce\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\">Here are some differentiators that set BLOOM AI apart from other language models:<\/p>\n<ul class=\"\">\n<li id=\"5607\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">Trained on 384 GPUs with 80 GB of memory each on the 28 PFLOPS Jean Zay supercomputer.<\/li>\n<li id=\"d3a6\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">Utilizes 176 billion parameters.<\/li>\n<li id=\"dd2b\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">Seventy layers with 112 attention heads for each layer.<\/li>\n<li id=\"b8b6\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg 
afh afi agq agr ags bj\" data-selectable-paragraph=\"\">Implements ALiBi positional embeddings \u2014 GeLU activation function<\/li>\n<li id=\"72e8\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">Open-source, anyone can use and access it.<\/li>\n<\/ul>\n<figure class=\"agl agm agn ago agp adq ux uy paragraph-image\">\n<div class=\"aef aeg dj aeh bg aei\" tabindex=\"0\" role=\"button\">\n<div class=\"ux uy agk\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*BfIIRqXBBDjooQJ5 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*BfIIRqXBBDjooQJ5 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*BfIIRqXBBDjooQJ5 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*BfIIRqXBBDjooQJ5 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*BfIIRqXBBDjooQJ5 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*BfIIRqXBBDjooQJ5 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*BfIIRqXBBDjooQJ5 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" \/><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*BfIIRqXBBDjooQJ5 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*BfIIRqXBBDjooQJ5 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*BfIIRqXBBDjooQJ5 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*BfIIRqXBBDjooQJ5 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*BfIIRqXBBDjooQJ5 828w, 
https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*BfIIRqXBBDjooQJ5 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*BfIIRqXBBDjooQJ5 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\" \/><img decoding=\"async\" class=\"bg aej aek c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*BfIIRqXBBDjooQJ5\" alt=\"\" width=\"700\" height=\"394\" \/><\/picture><\/div>\n<\/div>\n<\/figure>\n<h2 id=\"18b0\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\">Understanding BLOOM AI\u2019s Architecture<\/strong><\/h2>\n<h3 id=\"1fd3\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\"><em class=\"age\">How does the BLOOM model work?<\/em><\/strong><\/h3>\n<p id=\"79ac\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\">The architecture of BLOOM is based on the causal decoder-only Transformer, which is the standard choice for LLMs with more than 100B parameters. 
However, the researchers and developers introduced key variations to the standard model to improve BLOOM\u2019s performance and training stability.<\/p>\n<p id=\"5ef9\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\"><em><strong class=\"vs im\">Here are some innovations that make BLOOM different:<\/strong><\/em><\/p>\n<ul class=\"\">\n<li id=\"b957\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">\n<h4><strong class=\"vs im\">ALiBi Positional Embedding<\/strong><\/h4>\n<\/li>\n<\/ul>\n<p id=\"b596\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">In the standard architecture, positional information is added to the embedding layer. While building BLOOM, however, the developers implemented ALiBi (Attention with Linear Biases), which instead attenuates the attention scores directly, based on the distance between the keys and queries. The original motivation for choosing ALiBi was its ability to extrapolate to longer sequences. However, to the researchers\u2019 surprise, ALiBi also enhanced downstream performance and led to a smoother training process. 
It even outperformed both learned and rotary embeddings.<\/p>\n<ul class=\"\">\n<li id=\"633f\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">\n<h4><strong class=\"vs im\">Embedding LayerNorm<\/strong><\/h4>\n<\/li>\n<\/ul>\n<p id=\"1090\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">During preliminary experiments on a 104-billion-parameter model, the development team tried adding an extra layer normalization right after the embedding layer, which significantly improved training stability. The BigScience team therefore decided to train BLOOM with this additional layer normalization to avoid training instabilities. Notably, the preliminary experiments were conducted in float16, while the final training was performed in bfloat16. Since float16 has been identified as a cause of training instabilities, it is possible that bfloat16 reduces the need for an embedding LayerNorm.<\/p>\n<ul class=\"\">\n<li id=\"3363\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">\n<h4><strong class=\"vs im\"><em class=\"afm\">BLOOM Training Process<\/em><\/strong><\/h4>\n<\/li>\n<\/ul>\n<p id=\"aad0\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The BLOOM Model was trained on the ROOTS corpus, and the training process comprises different stages such as data sourcing and processing. 
The ROOTS corpus consists of 498 Hugging Face datasets that cover 46 natural languages and 13 programming languages.<\/p>\n<p id=\"b5c8\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The BLOOM model was trained with Megatron-DeepSpeed, a state-of-the-art framework for large-scale distributed training. This framework consists of two parts:<\/p>\n<ol class=\"\">\n<li id=\"af6f\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agy agr ags bj\" data-selectable-paragraph=\"\">\n<h5><strong class=\"vs im\">Megatron-LM<\/strong><\/h5>\n<p>It provides the Transformer implementation, tensor parallelism, and data loading primitives.<\/li>\n<li id=\"7a7e\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agy agr ags bj\" data-selectable-paragraph=\"\">\n<h5><strong class=\"vs im\">DeepSpeed<\/strong><\/h5>\n<p>It provides the ZeRO optimizer, model pipelining, and general distributed training components.<\/li>\n<\/ol>\n<p id=\"820e\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">This framework, which fuses Megatron-LM and DeepSpeed, offers efficient training with 3D parallelism. 
It combines four complementary approaches to distributed deep learning:<\/p>\n<ol class=\"\">\n<li id=\"edf5\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agy agr ags bj\" data-selectable-paragraph=\"\">\n<h4><strong class=\"vs im\">Data Parallelism<\/strong><\/h4>\n<\/li>\n<\/ol>\n<p id=\"21f8\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Data parallelism creates multiple replicas of the model and places each replica on a different device. Each replica is fed a different slice of the data and processes it in parallel with the others. All the model replicas are synchronized at the end of every training step.<\/p>\n<h4 id=\"0858\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\"><strong class=\"vs im\">2. Tensor Parallelism<\/strong><\/h4>\n<p id=\"7447\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Tensor parallelism partitions individual layers of the model across multiple devices. Instead of having a whole activation or gradient tensor stored on a single GPU, shards of the tensor are placed on different GPUs. This technique is also known as horizontal parallelism or intra-layer model parallelism.<\/p>\n<h4 id=\"4a62\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\"><strong class=\"vs im\">3. 
Pipeline Parallelism<\/strong><\/h4>\n<p id=\"659c\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Pipeline parallelism splits the model\u2019s layers across multiple GPUs so that each GPU handles only a fraction of the layers. This technique is also known as vertical parallelism.<\/p>\n<h4 id=\"6e63\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\"><strong class=\"vs im\">4. ZeRO Optimizer<\/strong><\/h4>\n<p id=\"6463\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">ZeRO (Zero Redundancy Optimizer) lets each process hold only a fraction of the data (parameters, gradients, and optimizer states) required for a training step. The developers used ZeRO stage 1, in which only the optimizer states are sharded.<\/p>\n<p id=\"c20e\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The BLOOM model was trained for 117 days and achieved a training throughput of about 150 TFLOPS per GPU, which was reported as the highest throughput achievable with A100 80GB GPUs at the time.<\/p>\n<h2 id=\"3395\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\"><em class=\"age\">Advantages of the BLOOM AI model:<\/em><\/strong><\/h2>\n<p id=\"ddf3\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\">BLOOM offers many benefits, making it one of the most powerful tools for diverse industry domains. 
Here are some of its benefits:<\/p>\n<ul class=\"\">\n<li id=\"bc53\" class=\"aep aeq zh vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">The BLOOM model\u2019s ability to swiftly adapt to new tasks, even with minimal training data, is one of its most striking aspects.<\/li>\n<li id=\"883c\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">The BLOOM project prioritizes ethical and fair use to minimize biases and promote transparency and trustworthiness.<\/li>\n<li id=\"a305\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">As new tasks emerge, the model can be prompted or fine-tuned for them without retraining it from scratch.<\/li>\n<li id=\"fde4\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">Because the model is open, it can be fine-tuned on more recent data to keep it in sync with changing data distributions.<\/li>\n<li id=\"3699\" class=\"aep aeq zh vs b aer agt aet aeu aev agu aex aey afn agv afa afb afo agw afd afe afp agx afg afh afi agq agr ags bj\" data-selectable-paragraph=\"\">The capacity of the BLOOM model to learn from sparse data and its complex neural network design contribute to its high accuracy.<\/li>\n<\/ul>\n<h2 id=\"e8c1\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\"><em class=\"age\">Limitations of the BLOOM AI Model:<\/em><\/strong><\/h2>\n<p id=\"6c55\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" 
data-selectable-paragraph=\"\">One thing that keeps every organization from harnessing its potential is its high running cost. The BLOOM model was trained on 384 NVIDIA A100 GPUs, which cost around $32,000 each. LLM research keeps pushing toward ever larger models, which drives up both training and running costs.<\/p>\n<p id=\"6b9e\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Moreover, the compressed version of BLOOM is 227 GB, and running the model requires specialized hardware with hundreds of gigabytes of VRAM. Unlike ChatGPT, which users access as a hosted service, self-hosting BLOOM requires a large computing cluster equivalent to an NVIDIA DGX 2, which costs around $400,000. Hugging Face plans to launch an API platform for researchers at $40\/month, although even that may not be cost-effective for everyone.<\/p>\n<p id=\"a3f4\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Besides, the BLOOM model is trained on real-world datasets, because of which it may generate biased content. 
This can lead to over-representing some figures, under-representing some facts, and encouraging stereotypes which can lead to the creation of factually incorrect content and the generation of repetitive texts.<\/p>\n<h2 id=\"ab85\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\"><em class=\"age\">Applications of BLOOM<\/em><\/strong><\/h2>\n<figure class=\"agl agm agn ago agp adq ux uy paragraph-image\">\n<div class=\"aef aeg dj aeh bg aei\" tabindex=\"0\" role=\"button\">\n<div class=\"ux uy agk\"><picture><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*gbH4QKjc0A6oMYbF 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*gbH4QKjc0A6oMYbF 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*gbH4QKjc0A6oMYbF 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*gbH4QKjc0A6oMYbF 786w, https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*gbH4QKjc0A6oMYbF 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*gbH4QKjc0A6oMYbF 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*gbH4QKjc0A6oMYbF 1400w\" type=\"image\/webp\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" \/><source srcset=\"https:\/\/miro.medium.com\/v2\/resize:fit:640\/0*gbH4QKjc0A6oMYbF 640w, https:\/\/miro.medium.com\/v2\/resize:fit:720\/0*gbH4QKjc0A6oMYbF 720w, https:\/\/miro.medium.com\/v2\/resize:fit:750\/0*gbH4QKjc0A6oMYbF 750w, https:\/\/miro.medium.com\/v2\/resize:fit:786\/0*gbH4QKjc0A6oMYbF 786w, 
https:\/\/miro.medium.com\/v2\/resize:fit:828\/0*gbH4QKjc0A6oMYbF 828w, https:\/\/miro.medium.com\/v2\/resize:fit:1100\/0*gbH4QKjc0A6oMYbF 1100w, https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*gbH4QKjc0A6oMYbF 1400w\" sizes=\"(min-resolution: 4dppx) and (max-width: 700px) 50vw, (-webkit-min-device-pixel-ratio: 4) and (max-width: 700px) 50vw, (min-resolution: 3dppx) and (max-width: 700px) 67vw, (-webkit-min-device-pixel-ratio: 3) and (max-width: 700px) 65vw, (min-resolution: 2.5dppx) and (max-width: 700px) 80vw, (-webkit-min-device-pixel-ratio: 2.5) and (max-width: 700px) 80vw, (min-resolution: 2dppx) and (max-width: 700px) 100vw, (-webkit-min-device-pixel-ratio: 2) and (max-width: 700px) 100vw, 700px\" data-testid=\"og\" \/><img decoding=\"async\" class=\"bg aej aek c\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/v2\/resize:fit:1400\/0*gbH4QKjc0A6oMYbF\" alt=\"\" width=\"700\" height=\"394\" \/><\/picture><\/div>\n<\/div>\n<\/figure>\n<blockquote class=\"afj afk afl\">\n<p id=\"1bfe\" class=\"aep aeq afm vs b aer aes aet aeu aev aew aex aey afn aez afa afb afo afc afd afe afp aff afg afh afi qk bj\" data-selectable-paragraph=\"\"><strong class=\"vs im\">BLOOM learning capabilities help in natural language processing<\/strong><\/p>\n<\/blockquote>\n<p id=\"0906\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The BLOOM AI model presents many applications throughout various industries and businesses. Its potential can be leveraged to improve operational efficiency and open new doorways for innovation. 
One of the potential applications of the BLOOM AI model is natural language processing, which includes but is not limited to sentiment analysis, text summarization, and language translation.<\/p>\n<p id=\"6932\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Having been trained on 46 languages and 13 programming languages, the model can generate coherent text for purposes such as marketing and content creation. Researchers and developers can also use it as a basis for building advanced language models and artificial intelligence tools.<\/p>\n<p id=\"45b4\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The researchers have warned that content generated by the model may not be accurate, and that its output on factual subjects such as math and history should not be trusted without verification. This limits its usage for biomedical, political, and legal purposes.<\/p>\n<h2 id=\"401c\" class=\"afq afr zh be afs aft afu afv pb pc afw pd pe afx afy afz aga pi agb pj pk pl agc pm pn agd bj\"><strong class=\"al\"><em class=\"age\">Wrapping up,<\/em><\/strong><\/h2>\n<p id=\"e25f\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\">The BLOOM AI model opens the portal to next-level intelligence with its exceptional accuracy, scalability, flexibility, rapid learning, and natural language processing. 
All these abilities make it an excellent tool to implement in various industries to make operations easier.<\/p>\n<p id=\"0c02\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">The model\u2019s capacity to handle and analyze complex data, generate human-like responses, and take decisions based on ethical approaches makes it different from other language models. Organizations can leverage the potential of BLOOM to improve their operational efficiency and productivity. The progress in AI technology opens up new doors and unlocks opportunities to revolutionize the world, and BLOOM is one of the important stepping stones in the transformational journey.<\/p>\n<p id=\"c803\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">Thanks for sticking on till the end. We appreciate your interest and commitment in exploring this fascinating field. We hope that you found the information valuable and insightful.<\/p>\n<p id=\"4ea7\" class=\"pw-post-body-paragraph aep aeq zh vs b aer aes aet aeu aev aew aex aey pf aez afa afb qt afc afd afe qx aff afg afh afi qk bj\" data-selectable-paragraph=\"\">If you are interested in exploring Generative AI and have any relevant projects or collaborations in mind, we would be pleased to hear from you. Please feel free to\u00a0<a class=\"af jt\" href=\"https:\/\/www.codecrafttech.com\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"vs im\">contact us<\/strong><\/a><strong class=\"vs im\">\u00a0<\/strong>to discuss any ideas, questions, or potential opportunities. 
Once again, thank you for your readership, and we look forward to connecting with you!<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"ab cm mg nk sz agz\" role=\"separator\"><\/div>\n<div class=\"qk xr ze jv zf\">\n<div class=\"ab cm\">\n<div class=\"ed bg ee ef eg eh\">\n<h2 id=\"7ad8\" class=\"afq afr zh be afs aft ahe afv pb pc ahf pd pe afx ahg afz aga pi ahh pj pk pl ahi pm pn agd bj\"><em class=\"age\">About the Author:<\/em><\/h2>\n<p id=\"3a77\" class=\"pw-post-body-paragraph aep aeq zh vs b aer agf aet aeu aev agg aex aey pf agh afa afb qt agi afd afe qx agj afg afh afi qk bj\" data-selectable-paragraph=\"\"><a class=\"af jt\" href=\"https:\/\/www.linkedin.com\/in\/drkirankumarc\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"vs im\">Dr. Kiran Kumar<\/strong><\/a><strong class=\"vs im\">\u00a0<\/strong>is an accomplished AI researcher, innovator, and senior data scientist. With a Ph.D. in Supply Chain Analytics, he possesses a profound understanding of data analysis and machine-learning techniques. His extensive research contributions are showcased through numerous publications in esteemed international journals. Driven by a passion for pioneering advancements, he holds patents for groundbreaking innovations in the field. Currently, he is focused on developing cutting-edge products by leveraging his expertise in prompt engineering and generative AI.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The emergence of artificial intelligence has created a breakthrough in the world. The BLOOM model is a versatile framework at the technology forefront with advanced capabilities of understanding natural language, machine learning, and problem-solving. 
The BLOOM model, \u201cBiologically Localized and\u00a0Online\u00a0One-shot Multi-Task Learning,\u201d is a machine learning framework,\u00a0breaking the frontiers in generative AI, that blends the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1757,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_mo_disable_npp":"no","_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[22,1],"tags":[75,73,74,62],"class_list":["post-1756","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blogs","category-highlights","tag-artificial-intelligence","tag-bloomai","tag-generativeai","tag-technology"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/posts\/1756","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/comments?post=1756"}],"version-history":[{"count":7,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/posts\/1756\/revisions"}],"predecessor-version":[{"id":1827,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/posts\/1756\/revisions\/1827"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/media\/1757"}],"wp:attachment":[{"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/media?parent=1756"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/categories?post=1756"},{"taxonomy":"post_tag","
embeddable":true,"href":"https:\/\/www.codecrafttech.com\/resources\/wp-json\/wp\/v2\/tags?post=1756"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}