Glossary

Activities
Activities are steps or procedures contained in process guidance which are executed to satisfy one or more process requirements. Process guidance provides default activities which can be followed by teams, or teams may substitute or modify activities if those alternative activities meet the process requirements. Examples of activities include writing user stories, refining product backlogs, and writing or redefining acceptance criteria.
Algorithm
A mathematical formula or statistical process used to perform analysis of data.
Anonymization
Making data anonymous; severing the links between people in a database and their records to prevent discovery of the records' source.
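As a minimal sketch with hypothetical record fields, one common approach drops direct identifiers and replaces them with a salted one-way hash. Strictly speaking, a keyed hash is pseudonymization; true anonymization also requires discarding the key.

    import hashlib

    def anonymize(record, salt):
        # Drop direct identifiers so the record cannot be traced to a person.
        scrubbed = {k: v for k, v in record.items() if k not in ("name", "email")}
        # Replace identity with a salted one-way hash; discard the salt
        # afterward to sever the link entirely.
        digest = hashlib.sha256((salt + record["email"]).encode()).hexdigest()
        scrubbed["subject_id"] = digest[:12]
        return scrubbed

    print(anonymize({"name": "Ada", "email": "ada@example.com", "visits": 7}, "s3cret"))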
Application Programming Interface (API)
A set of programming standards and instructions for accessing or building web-based software applications.
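For illustration, a call to a hypothetical web API using only Python's standard library; the URL and response fields are made up, since a real API documents its own endpoints and schema.

    import json
    import urllib.request

    # Hypothetical endpoint; a real API publishes its own URL and schema.
    url = "https://api.example.com/v1/users/42"
    with urllib.request.urlopen(url) as response:
        user = json.load(response)   # parse the JSON response body
    print(user["name"])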
Artifacts
Artifacts are tangible by-products produced during product development. Artifacts defined by Scrum are specifically designed to maximize transparency of key information so that everybody has the same understanding of the product under development and of the activities planned and performed in the project. The three primary artifacts of Scrum are the Product Backlog, the Sprint Backlog, and the Product Increment.
Artificial Intelligence
The apparent ability of a machine to apply information gained from previous experience accurately to new situations in a way that a human would.
Azure DevOps
A Microsoft solution intended to provide a union of people, process, and products to enable continuous delivery of value to end users.
Batch Processing
Batch data processing is an efficient way of processing high volumes of data where a group of transactions is collected over a period of time. Hadoop is focused on batch data processing.
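A minimal sketch of the batching idea, with made-up transaction amounts: records accumulate until a batch is full, then are processed as a group.

    def process_batch(batch):
        # Stand-in for real work, e.g. a bulk database insert.
        print(f"processing {len(batch)} transactions, total = {sum(batch)}")

    transactions = [12.5, 3.0, 7.25, 40.0, 1.1, 9.9, 15.0]
    BATCH_SIZE = 3

    batch = []
    for txn in transactions:
        batch.append(txn)
        if len(batch) == BATCH_SIZE:
            process_batch(batch)
            batch = []
    if batch:                # flush the final partial batch
        process_batch(batch)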
Big Data
Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.
Business Intelligence
The general term used for the identification, extraction, and analysis of data.
Cloud
A broad term that refers to any internet-based application or service that is hosted remotely.
Cloud Computing
A distributed computing system hosted and running on remote servers and accessible from anywhere on the internet.
Code Artifacts
Any artifacts produced from the code including Code Quality or Security scan results, code complexity outputs, package metadata, Software Bill of Materials, etc.
Columnar Database
A database that stores data by column rather than by row. In a row-based database, a row might contain a name, address, and phone number. In a column-oriented database, all names are in one column, addresses in another and so on. A key advantage of a columnar database is faster hard disk access.
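The layout difference can be shown directly; in the column-oriented form, reading every name touches one contiguous list rather than every record.

    # Row-oriented: each record is stored together.
    rows = [
        {"name": "Ada",   "address": "1 Main St", "phone": "555-0101"},
        {"name": "Grace", "address": "2 Oak Ave", "phone": "555-0102"},
    ]

    # Column-oriented: each column is stored together.
    columns = {
        "name":    ["Ada", "Grace"],
        "address": ["1 Main St", "2 Oak Ave"],
        "phone":   ["555-0101", "555-0102"],
    }

    # Scanning all names reads one column, not every full row.
    print(columns["name"])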
Common Vulnerability Scoring System (CVSS)
Provides a way to capture the principal characteristics of a software vulnerability and produces a numerical score reflecting its severity. The numerical score can then be translated into a qualitative representation (such as low, medium, high, and critical) to help organizations properly assess and prioritize their vulnerability management processes.
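For example, the CVSS v3.x qualitative rating scale can be expressed as a small lookup function (ranges per the published v3.x scale).

    def cvss_severity(score: float) -> str:
        """Map a CVSS v3.x base score (0.0-10.0) to its qualitative rating."""
        if score == 0.0:
            return "None"
        if score <= 3.9:
            return "Low"
        if score <= 6.9:
            return "Medium"
        if score <= 8.9:
            return "High"
        return "Critical"

    print(cvss_severity(7.5))  # "High"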
Dashboard
A graphical representation of analyses performed by algorithms.
Data
Facts, figures, or information stored in or used by a system or computer.
Data Aggregation
The process of collecting data from multiple sources for the purpose of reporting or analysis.
Data Analytics
The process of examining large data sets to uncover hidden patterns, unknown correlations, trends, customer preferences and other useful business insights.
Data Architecture and Design
How enterprise data is structured. The actual structure or design varies depending on the eventual end result required. Data architecture has three stages or processes: (1) conceptual representation of business entities, (2) the logical representation of the relationships among those entities and (3) the physical construction of the system to support the functionality.
Data as a Service (DaaS)
Treats data as a product. DaaS providers use cloud solutions to give customers on-demand access to data.
Data Center
A physical facility that houses a large number of servers and data storage devices. Data centers might belong to a single organization or sell their services to many organizations.
Data Cleansing
The process of reviewing and revising data to delete duplicate entries, correct misspelling, and other errors, add missing data and provide consistency.
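A minimal sketch with made-up records, covering the operations above: dropping a duplicate entry, correcting a misspelling, and filling a missing value.

    raw = [
        {"id": 1, "city": "San Deigo", "zip": "92101"},   # misspelled city
        {"id": 1, "city": "San Deigo", "zip": "92101"},   # duplicate entry
        {"id": 2, "city": "San Diego", "zip": None},      # missing zip
    ]

    corrections = {"San Deigo": "San Diego"}   # known-bad spellings
    default_zip = "00000"                      # placeholder for missing data

    seen, clean = set(), []
    for rec in raw:
        if rec["id"] in seen:                  # delete duplicate entries
            continue
        seen.add(rec["id"])
        rec["city"] = corrections.get(rec["city"], rec["city"])
        rec["zip"] = rec["zip"] or default_zip
        clean.append(rec)

    print(clean)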
Data Element
One or more columns within a row of data.
Data Governance
A set of processes or rules that ensure data integrity and that data management best practices are met.
Data Integration
The process of combining data from different sources and presenting it in a single view.
Data Integrity
The measure of trust an organization has in the accuracy, completeness, timeliness and validity of the data.
Data Lake
A large repository of enterprise-wide data in raw format. Data lakes are intended to make enterprise-wide data easy to access.
Data Mart
The access layer of a data warehouse used to provide data to users.
Data Mining
Finding meaningful patterns and deriving insights in large sets of data using sophisticated pattern recognition techniques. To derive meaningful patterns, data miners use statistics, machine learning algorithms and artificial intelligence.
Data Modeling
A data model defines the structure of the data; it gives functional and technical people, and the members of an application development team, a common reference for how data is stored and accessed.
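As a sketch, a data model can be written down in code; here, hypothetical Customer and Order entities with a one-to-many relationship, expressed as Python dataclasses.

    from dataclasses import dataclass, field

    @dataclass
    class Order:
        order_id: int
        total: float

    @dataclass
    class Customer:
        customer_id: int
        name: str
        orders: list[Order] = field(default_factory=list)   # one-to-many

    alice = Customer(1, "Alice", [Order(100, 59.99)])
    print(alice)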
Data Science
A discipline that incorporates statistics, data visualization, computer programming, data mining, machine learning and database engineering to solve complex problems.
Data Warehouse
A repository of enterprise-wide data in a structured format, after the data has been cleaned and integrated with other sources.
Database
A digital collection of data and the structure around which the data is organized. The data is typically entered into and accessed via a database management system.
Distributed File System
A data storage system that stores large volumes of data across multiple storage devices, helping to decrease the cost and complexity of storing large amounts of data.
Events
An event is a prescribed occurrence used to create regularity and minimize the need for meetings not defined in Scrum. Events are designed to enable transparency and give a Scrum Team the opportunity to inspect and adapt its process. All events have a maximum duration and, except for the Sprint, may end whenever the purpose of the event is achieved. Examples of events are the Sprint, Sprint Planning, Daily Scrum, Sprint Review, and Sprint Retrospective.
Extract, Transform and Load (ETL)
The process of extracting raw data, transforming it by cleaning and enriching it to fit operational needs, and loading it into the appropriate repository for the system's use.
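A minimal end-to-end sketch, with an in-memory list standing in for the source and an in-memory SQLite table standing in for the target repository.

    import sqlite3

    # Extract: raw data from a hypothetical source (a CSV-like list).
    raw = ["alice,34", "bob,", "alice,34"]

    # Transform: parse, drop duplicates, and flag the missing age.
    seen, cleaned = set(), []
    for line in raw:
        name, age = line.split(",")
        if name in seen:
            continue
        seen.add(name)
        cleaned.append((name, int(age) if age else -1))   # -1 flags "unknown"

    # Load: write into the target repository (an in-memory SQLite table).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE people (name TEXT, age INTEGER)")
    db.executemany("INSERT INTO people VALUES (?, ?)", cleaned)
    print(db.execute("SELECT * FROM people").fetchall())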
Machine Learning
A method of designing systems that can learn, adjust, and improve based on the data fed to them. Using predictive and statistical algorithms, these systems learn and continually zero in on "correct" behavior and insights, and they keep improving as more data flows through the system.
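To illustrate "improving as more data flows through the system", a tiny online-learning sketch: a one-parameter model y ~ w * x that nudges its weight with each new observation (the data and learning rate are made up).

    # Model: y ~ w * x. Each observation nudges w toward reducing the error.
    w = 0.0
    learning_rate = 0.05
    stream = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.0)]   # noisy y = 2x

    for x, y in stream:
        error = w * x - y
        w -= learning_rate * error * x    # gradient step on squared error
        print(f"after (x={x}, y={y}): w = {w:.3f}")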
Metadata
Data about data; it describes the data itself, for example, where the data points were collected.
Multi-dimensional Databases
A database optimized for online analytical processing (OLAP) applications and for data warehousing.
Not ONLY SQL (NoSQL)
A broad class of database management systems identified by non-adherence to the widely used relational database management system model. NoSQL databases are not built primarily on tables and generally do not use SQL for data manipulation.
Observation
A row of data. Same as data element but this term is commonly used in statistical studies.
Online Transaction Processing (OLTP)
A software program or database supporting transaction-oriented applications. A transaction is a discrete sequence of information exchange or work, such as an order entry or a payment.
POP
Period of Performance
Population
A dataset that consists of all the members of some group.
Predictive Analytics
Using statistical functions on one or more data sets to predict trends or future events.
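A minimal example: fitting a straight-line trend to past values with ordinary least squares and extrapolating one period ahead (the sales figures are invented).

    # Monthly sales (hypothetical). Fit y = a + b*t, then predict month 7.
    sales = [100, 104, 109, 115, 118, 124]
    t = list(range(1, len(sales) + 1))

    n = len(sales)
    mean_t = sum(t) / n
    mean_y = sum(sales) / n
    b = (sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, sales))
         / sum((ti - mean_t) ** 2 for ti in t))
    a = mean_y - b * mean_t

    print(f"trend: y = {a:.1f} + {b:.2f}*t, forecast for t=7: {a + b * 7:.1f}")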
Prescriptive Analytics
Prescriptive analytics builds on predictive analytics by recommending actions and supporting data-driven decisions based on the projected impacts of those actions.
Process Quality Assurance (PQA)
The Process Quality Assurance processes support the delivery of high-quality products by providing project staff and managers at all levels with appropriate visibility into, and feedback on, processes and associated work products throughout the life of the project.
Product Integration Schedule
The Product Integration Schedule is defined in the Product Integration Plan. It is the schedule for the completion of work items and for how they will be compiled, packaged, and deployed.
Roles
Any of a number of agile and/or organizationally defined roles. An employee may serve multiple concurrent roles depending on their skills, experience, and team needs. Some roles may be modified and/or informally defined at the software development team level via the tailoring process. The recognized roles in Scrum are Product Owner, Scrum Master, and Development Team.
Sample
A data set which consists of only a portion of the members from some population. Sample statistics are used to draw inferences about the entire population from the measurements of a sample.
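For instance, the sample mean and its standard error estimate the population mean from a sample (the numbers here are made up).

    import statistics

    sample = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7]

    mean = statistics.mean(sample)
    stdev = statistics.stdev(sample)          # sample standard deviation
    sem = stdev / len(sample) ** 0.5          # standard error of the mean

    print(f"estimated population mean: {mean:.2f} +/- {sem:.2f}")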
Software as a Service (SaaS)
Enables vendors to host an application and make it available via the internet (cloud servicing). SaaS providers deliver software as a service over the cloud rather than as locally installed copies.
Software Supply Chain Management
As it pertains to InnovaSystems, software supply chain management means understanding where our third-party software and libraries come from and managing any potential risk around the supply and use of that software. The risk of security vulnerabilities introduced via third-party software libraries is very real. Other risks associated with the software supply chain include obsolescence, vendor insolvency, and licensing. If a team has issues in its software supply chain, that is a form of technical debt.
Spark (Apache Spark)
A fast, in-memory open source data processing engine to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets.
Sprint time-box
The duration of a sprint. Usually one to three weeks.
Tailoring
Tailoring is the process by which teams may receive organizational approval to use procedures for an activity that better suit the specific mission, skills, and customer requirements of their team, in lieu of the default procedures for that activity. Tailoring can apply to roles, events, activities, and artifacts, as long as the tailored procedure is deemed to meet applicable process requirements. Managed Process Improvement (MPI) maintains tailoring statements, which reside within the Governance tier.
Technical Debt
The term technical debt is a metaphor in the software development profession: like financial debt, technical debt needs to be managed. Technical debt is relative to the operating environment; if something slows a team down, or will slow a team down in the future, chances are it can be called technical debt. Just as companies take on financial debt, development teams may choose to incur technical debt. It is not inherently a bad thing, but it does need to be managed and prioritized.
Visualization
A visual abstraction of data designed for the purpose of deriving meaning or communicating information more effectively. The resulting visuals may be complex, but they should remain understandable so that the data's message comes through.

Process Guidance Version: 10.4