<?xml version="1.0"?>
<!DOCTYPE article
PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20190208//EN"
       "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="EDITORIAL" dtd-version="1.4" xml:lang="en">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">NATURAL AND MAN-MADE RISKS (PHYSICO-MATHEMATICAL AND APPLIED ASPECTS)</journal-id>
   <journal-title-group>
    <journal-title xml:lang="en">NATURAL AND MAN-MADE RISKS (PHYSICO-MATHEMATICAL AND APPLIED ASPECTS)</journal-title>
    <trans-title-group xml:lang="ru">
     <trans-title>ПРИРОДНЫЕ И ТЕХНОГЕННЫЕ РИСКИ (ФИЗИКО-МАТЕМАТИЧЕСКИЕ И ПРИКЛАДНЫЕ АСПЕКТЫ)</trans-title>
    </trans-title-group>
   </journal-title-group>
   <issn publication-format="print">2307-7476</issn>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="publisher-id">75149</article-id>
   <article-id pub-id-type="doi">10.61260/2307-7476-2024-2023-4-45-52</article-id>
   <article-categories>
    <subj-group subj-group-type="toc-heading" xml:lang="ru">
     <subject>ИНЖЕНЕРНОЕ И ИНФОРМАЦИОННОЕ ОБЕСПЕЧЕНИЕ БЕЗОПАСНОСТИ ПРИ ЧРЕЗВЫЧАЙНЫХ СИТУАЦИЯХ</subject>
    </subj-group>
    <subj-group subj-group-type="toc-heading" xml:lang="en">
     <subject>ENGINEERING AND INFORMATION SECURITY IN EMERGENCY SITUATIONS</subject>
    </subj-group>
    <subj-group>
     <subject>ИНЖЕНЕРНОЕ И ИНФОРМАЦИОННОЕ ОБЕСПЕЧЕНИЕ БЕЗОПАСНОСТИ ПРИ ЧРЕЗВЫЧАЙНЫХ СИТУАЦИЯХ</subject>
    </subj-group>
   </article-categories>
   <title-group>
    <article-title xml:lang="en">SOFTWARE TOOLS FOR BIG DATA PROCESSING</article-title>
    <trans-title-group xml:lang="ru">
     <trans-title>ПРОГРАММНЫЕ СРЕДСТВА ОБРАБОТКИ БОЛЬШИХ ОБЪЕМОВ ДАННЫХ</trans-title>
    </trans-title-group>
   </title-group>
   <contrib-group content-type="authors">
    <contrib contrib-type="author">
     <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-2735-4189</contrib-id>
     <name-alternatives>
      <name xml:lang="ru">
       <surname>Лабинский</surname>
       <given-names>Александр Юрьевич</given-names>
      </name>
      <name xml:lang="en">
       <surname>Labinsky</surname>
       <given-names>Alexander Yu.</given-names>
      </name>
     </name-alternatives>
     <email>labinsciy@yandex.ru</email>
     <bio xml:lang="ru">
      <p>кандидат технических наук</p>
     </bio>
     <bio xml:lang="en">
      <p>Candidate of Technical Sciences</p>
     </bio>
     <xref ref-type="aff" rid="aff-1"/>
    </contrib>
   </contrib-group>
   <aff-alternatives id="aff-1">
    <aff>
     <institution xml:lang="ru">Санкт-Петербургский университет ГПС МЧС России</institution>
     <city>Санкт-Петербург</city>
     <country>Россия</country>
    </aff>
    <aff>
     <institution xml:lang="en">Saint-Petersburg University of State Fire Service of EMERCOM of Russia</institution>
     <city>Saint-Petersburg</city>
     <country>Russian Federation</country>
    </aff>
   </aff-alternatives>
   <pub-date publication-format="print" date-type="pub" iso-8601-date="2024-02-14T17:24:04+03:00">
    <day>14</day>
    <month>02</month>
    <year>2024</year>
   </pub-date>
   <pub-date publication-format="electronic" date-type="pub" iso-8601-date="2024-02-14T17:24:04+03:00">
    <day>14</day>
    <month>02</month>
    <year>2024</year>
   </pub-date>
   <volume>2023</volume>
   <issue>4</issue>
   <fpage>45</fpage>
   <lpage>52</lpage>
   <history>
    <date date-type="received" iso-8601-date="2023-08-28T00:00:00+03:00">
     <day>28</day>
     <month>08</month>
     <year>2023</year>
    </date>
    <date date-type="accepted" iso-8601-date="2023-11-10T00:00:00+03:00">
     <day>10</day>
     <month>11</month>
     <year>2023</year>
    </date>
   </history>
   <self-uri xlink:href="https://journals.igps.ru/en/nauka/article/75149/view">https://journals.igps.ru/en/nauka/article/75149/view</self-uri>
   <abstract xml:lang="ru">
    <p>Рассмотрены возможности программных средств обработки больших объемов данных (Big Data). В центре внимания статьи находятся инструменты платформы Apache NiFi, входящие в набор Hadoop-инструментов для бизнес-экосистем.
Подробно рассмотрены такие средства, как свободно распространяемый набор утилит и библиотек для разработки и выполнения распределенных программ (Hadoop Common), включающий в себя библиотеки управления системами файлов и сценарии по управлению распределённой обработкой данных и созданию инфраструктуры, необходимой для этой обработки.
Рассмотрены инструменты платформы Apache NiFi, в том числе набор современных ETL-инструментов (Extract, Transform, Load) для разработки хранилища большого объема данных, а также основные понятия платформы Apache NiFi, использующей концепцию «Flow Based Programming» (FBP).
Произведена оценка эффективности параллельной обработки данных.</p>
   </abstract>
   <trans-abstract xml:lang="en">
    <p>The article examines software tools for processing large volumes of data (Big Data). It focuses on the tools of the Apache NiFi platform, which belong to the set of Hadoop tools for business ecosystems.
Hadoop Common, a freely distributed set of utilities and libraries for developing and running distributed programs, is discussed in detail. It includes libraries for managing the file systems supported by Hadoop, as well as scripts for managing distributed data processing and for creating the infrastructure this processing requires.
The tools of the Apache NiFi platform are considered, including a set of modern ETL (Extract, Transform, Load) tools for building a large data warehouse, as well as the basic concepts of the Apache NiFi platform, which is built on the concept of "Flow Based Programming" (FBP).
The efficiency of parallel data processing is evaluated. The evaluation shows that the speedup of computation decreases as the share of sequential operations in the data processing program grows.
The topic is relevant because large data sets are now used everywhere, and processing them yields a significant positive effect every day.</p>
   </trans-abstract>
   <kwd-group xml:lang="ru">
    <kwd>программные средства</kwd>
    <kwd>большие объемы данных</kwd>
    <kwd>параллельная обработка данных</kwd>
    <kwd>платформа Apache NiFi</kwd>
    <kwd>ETL-инструменты</kwd>
    <kwd>Hadoop-инструменты</kwd>
    <kwd>бизнес-экосистема</kwd>
    <kwd>концепция Flow Based Programming</kwd>
    <kwd>дистрибутив Hortonworks Data Platform</kwd>
   </kwd-group>
   <kwd-group xml:lang="en">
    <kwd>software</kwd>
    <kwd>large amounts of data</kwd>
    <kwd>parallel data processing</kwd>
    <kwd>Apache NiFi platform</kwd>
    <kwd>ETL tools</kwd>
    <kwd>Hadoop tools</kwd>
    <kwd>business ecosystem</kwd>
    <kwd>Flow Based Programming concept</kwd>
    <kwd>Hortonworks Data Platform distribution</kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <p></p>
 </body>
 <back>
  <ref-list>
   <ref id="B1">
    <label>1.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Баканов В.И. Динамика потоковых вычислений. М.: Труды НИУ ВШЭ, 2021.</mixed-citation>
     <mixed-citation xml:lang="en">Bakanov V.I. Dinamika potokovyh vychislenij. M.: Trudy NIU VSHE, 2021.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B2">
    <label>2.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Лэм Чак. Hadoop в действии. ДМК Пресс, 2012.</mixed-citation>
     <mixed-citation xml:lang="en">Lem Chak. Hadoop v dejstvii. DMK Press, 2012.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B3">
    <label>3.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Уайт Том. Hadoop. Подробное руководство. СПб.: Питер, 2013.</mixed-citation>
     <mixed-citation xml:lang="en">Uajt Tom. Hadoop. Podrobnoe rukovodstvo. SPb.: Piter, 2013.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B4">
    <label>4.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Vance Ashlee. Hadoop, a Free Software Program, Finds Uses Beyond Search. N.Y.: The New York Times, 2009.</mixed-citation>
     <mixed-citation xml:lang="en">Vance Ashlee. Hadoop, a Free Software Program, Finds Uses Beyond Search. N.Y.: The New York Times, 2009.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B5">
    <label>5.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Shvachko Konstantin. Apache Hadoop. Coriolis, 2011.</mixed-citation>
     <mixed-citation xml:lang="en">Shvachko Konstantin. Apache Hadoop. Coriolis, 2011.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B6">
    <label>6.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Sharp J.A. Data Flow Computing: Theory and Practice. Intellect Limited, 1992.</mixed-citation>
     <mixed-citation xml:lang="en">Sharp J.A. Data Flow Computing: Theory and Practice. Intellect Limited, 1992.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B7">
    <label>7.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Carkci M. Dataflow and Reactive Programming Systems: A Practical Guide. CreateSpace Independent Publishing Platform, 2014.</mixed-citation>
     <mixed-citation xml:lang="en">Carkci M. Dataflow and Reactive Programming Systems: A Practical Guide. CreateSpace Independent Publishing Platform, 2014.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B8">
    <label>8.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Wesley M. Johnston, J.R. Paul Hanna, Richard J. Millar. Advances in Dataflow Programming Languages. N.Y. and London, 2015.</mixed-citation>
     <mixed-citation xml:lang="en">Wesley M. Johnston, J.R. Paul Hanna, Richard J. Millar. Advances in Dataflow Programming Languages. N.Y. and London, 2015.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B9">
    <label>9.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">David Loshin. ETL (Extract, Transform, Load) // Business Intelligence and Analytics. Morgan Kaufmann, 2012.</mixed-citation>
     <mixed-citation xml:lang="en">David Loshin. ETL (Extract, Transform, Load) // Business Intelligence and Analytics. Morgan Kaufmann, 2012.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B10">
    <label>10.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">David Haertzen. ETL Tools // Business Intelligence and Analytics. Technics Publications, 2012.</mixed-citation>
     <mixed-citation xml:lang="en">David Haertzen. ETL Tools // Business Intelligence and Analytics. Technics Publications, 2012.</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B11">
    <label>11.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Ralph Kimball, Joe Caserta. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. John Wiley &amp; Sons, 2004.</mixed-citation>
     <mixed-citation xml:lang="en">Ralph Kimball, Joe Caserta. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data. John Wiley &amp; Sons, 2004.</mixed-citation>
    </citation-alternatives>
   </ref>
  </ref-list>
 </back>
</article>
