Nirvana was virtual object storage software developed and maintained by General Atomics. It can also be described as metadata, data placement and data management software that lets organizations manage unstructured data on multiple storage devices located anywhere in the world in order to orchestrate global data-intensive workflows, and to search for and locate data no matter where it is located or when it was created. Nirvana does this by capturing system and user-defined metadata to enable detailed search and to enact policies that control data movement and protection. Nirvana also maintains data provenance, audit, security and access control. Nirvana can reduce storage costs by identifying data to be moved to lower cost storage and data that no longer needs to be stored.


History

Nirvana is the result of research started in 1995 at the San Diego Supercomputer Center (SDSC), which was founded and at the time run by General Atomics, in response to a DARPA-sponsored project for a Massive Data Analysis System. Led by General Atomics computational plasma physicist Dr. Reagan Moore, development continued through the cooperative efforts of General Atomics and the SDSC on the Storage Resource Broker (SRB), with the support of the National Science Foundation (NSF). SRB 1.1 was delivered in 1998, demonstrating a logical distributed file system with a single Global Namespace across geographically distributed storage systems.

In 2003, General Atomics turned over operation of the SDSC to the University of California San Diego (UCSD), and Dr. Moore became a full-time professor there, establishing the Data Intensive Computing Environments (DICE) Center and continuing development of SRB. In that same year, General Atomics acquired the exclusive license to develop a commercial version of SRB, calling it Nirvana. The DICE team ended development of SRB in 2006 and started a rule-oriented data management project called iRODS for open source distribution. Dr. Moore and his DICE team relocated to the University of North Carolina at Chapel Hill, where iRODS is now maintained by the iRODS Consortium.

General Atomics continued development of Nirvana at their San Diego headquarters, focusing on capabilities to serve government and commercial users, including high scalability, fail-over, performance, implementation, maintenance and support. In 2009, General Atomics won a data management contract with the US Department of Defense (DOD) High Performance Computing Modernization Program. The requirements of this contract led General Atomics to expand Nirvana's performance, scalability, security and ease of use. A major deliverable involved integrating Nirvana with Oracle Corporation's SAM-QFS filesystem to provide a policy-based Hierarchical Storage Management (HSM) system with near real-time event synchronization. General Atomics also announced that digital marketing firm infoGROUP deployed Nirvana to create a Global Name Space across three of infoGROUP's computer operations centers in the Omaha area. In 2012, General Atomics released Nirvana version 4.3. In 2014, General Atomics changed the Nirvana business model from a large-government-contract, fee-for-service model to a standard commercial software model. In 2015, General Atomics initiated a strategic relationship with pixitmedia and arcastream in the United Kingdom, integrating Nirvana with pixitmedia and arcastream's products. In 2016, General Atomics released Nirvana version 5.0.

As of May 2018, the Nirvana marketing and support URLs under the General Atomics corporate umbrella (www.Nirvanastorage.com, www.ga.com/nirvana and https://www.nirvanaware.com), as well as the URLs of more recently branded integration offerings such as "Nirvana EasyHSM" (www.ga.com/easyhsm, mentioned in a January 2017 marketing slideshare), returned "cannot be found" from www.ga.com or a connection timeout. A "Nirvana" keyword search at www.ga.com returned only pages with archived indications. Nirvana pages and press releases archived by General Atomics are retrievable via http://www.ga.com/?Key=Search&q=nirvana


Architecture and operation

Nirvana is client-server software composed of Location Agents that reside on, or access, Storage Resources. A Storage Resource can be a network-attached storage (NAS) system, object storage system or cloud storage service. Nirvana catalogs the location of the files and objects in these storage resources into its Metadata Catalog (MCAT) and tags the files with storage system metadata (Owner, File Name, File Size and Creation, Change, Modification and Access Timestamps) and additional user-defined, domain-specific metadata. System and user-defined metadata can be used to search for a file or object (or groups of files and objects), to control access to those files and objects, and to move them from one storage resource to another. The MCAT creates a single Global Namespace across all Storage Resources connected to it, so users and administrators can search for, access, and move data across multiple heterogeneous storage systems from multiple vendors across geographically dispersed data centers. The MCAT is connected to and interacts with a relational database management system to support its operation. Multiple MCATs can be deployed for horizontal scale-out and failover.

Various Clients can interact with Nirvana, including the supplied Web browser and Java-based GUI Clients, a Command Line Interface, a native Windows virtual network drive interface, and user-developed applications via supplied APIs.

Nirvana operation is controlled by three daemons: Metadata, Sync and ILM. The Metadata Daemon can extract metadata automatically from an instrument creating data, extract it from within the file's actual data using predefined and customizable templates and metadata parsing policies, or capture user input via the GUI or Command Line Interface. The Sync Daemon, running in the background, detects when files are added to, or deleted from, the underlying Storage Resource filesystems. When the Sync Daemon observes filesystem changes, the changes are registered and updated in the MCAT. The ILM Daemon routinely queries the MCAT and executes actions including migration, replication, or backup on a specified schedule. For example, an administrator can set a policy to free up space on an expensive primary storage system by migrating data to distributed retention locations based on criteria such as storage consumption watermarks (percent full), all data associated with a specific project, or data that has not been accessed in over one year. Policies can also use user-defined metadata attributes (e.g. Project, Principal investigator, Data source, Location, Temperature) to move data. Nirvana ILM policy execution occurs behind the scenes, transparent to end-users or applications.
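
The migration criteria described above (watermark, project, idle time) can be illustrated with a minimal Python sketch. It is not Nirvana's API: the `CatalogEntry` record, the `select_for_migration` and `over_watermark` functions, and all field names are hypothetical stand-ins for what a metadata catalog and an ILM policy might look like.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CatalogEntry:
    """Hypothetical stand-in for one record in a metadata catalog."""
    logical_path: str      # name in the global namespace
    resource: str          # storage resource currently holding the file
    size_bytes: int
    last_access: datetime
    user_metadata: dict    # user-defined attributes, e.g. {"Project": "alpha"}


def over_watermark(used_bytes, capacity_bytes, watermark_pct):
    """True when a resource is fuller than the watermark (percent full)."""
    return used_bytes * 100 > capacity_bytes * watermark_pct


def select_for_migration(entries, source, now,
                         max_idle=timedelta(days=365), project=None):
    """Return entries on `source` that have been idle longer than `max_idle`
    or that belong to `project` (when a project is given)."""
    selected = []
    for entry in entries:
        if entry.resource != source:
            continue
        idle = now - entry.last_access > max_idle
        in_project = (project is not None
                      and entry.user_metadata.get("Project") == project)
        if idle or in_project:
            selected.append(entry)
    return selected
```

In this sketch a scheduler would call `over_watermark` for the primary resource and, when it returns true, pass the result of `select_for_migration` to whatever component copies the data and updates the catalog. Keeping selection separate from movement is a design choice of the sketch, not something the source states about Nirvana.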


Use cases


Data-aware cloud storage gateway

Nirvana's ILM functionality can be used as a cloud storage gateway, where data stored locally, on premises, can be moved to popular cloud storage services based on Nirvana's various metadata attributes and policies. In 2015, General Atomics and ArcaStream announced a cloud storage appliance that uses IBM's Spectrum Scale for on-premises storage and integrates with cloud storage providers Amazon S3 and Google Cloud Storage.


Advanced search

Nirvana can be used to run search queries that find data of interest using both system and user-defined metadata. Queries are entered either in the Command Line Interface or through the Web browser client.
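
As a rough illustration of searching on combined system and user-defined metadata, the following Python sketch filters catalog records against a list of criteria. The record layout (flat dictionaries), the `matches` and `search` functions, and the three supported operators are assumptions made for the example; they do not reproduce Nirvana's query syntax.

```python
def matches(record, criteria):
    """True when every (attribute, operator, value) criterion holds.

    A record that lacks an attribute never matches a criterion on it.
    """
    ops = {
        "==": lambda a, b: a == b,
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
    }
    for attr, op, value in criteria:
        if attr not in record:
            return False
        if not ops[op](record[attr], value):
            return False
    return True


def search(catalog, criteria):
    """Return the paths of all records that satisfy every criterion."""
    return [record["path"] for record in catalog if matches(record, criteria)]
```

A query such as `search(catalog, [("Project", "==", "p1"), ("size", ">=", 100)])` would then combine a user-defined attribute with a system attribute in one filter.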


Virtual collections

Nirvana can automate the grouping and distribution of data files into a virtual collection based on user-friendly logical rules. For example, user-defined metadata with domain-specific attributes (experiment, study, project, etc.) can be used to identify data files that need to be transferred between collaborators.
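
A rule-based virtual collection can be sketched in Python as a mapping from collection names to predicates over user-defined metadata. The `build_collections` function and the shape of `files` and `rules` are hypothetical; the sketch only shows the idea that a collection is a logical grouping computed from metadata, with no files copied.

```python
from collections import defaultdict


def build_collections(files, rules):
    """Group file paths into named virtual collections.

    files: dict mapping path -> metadata dict
    rules: dict mapping collection name -> predicate(metadata) -> bool
    A file may belong to several collections; nothing is moved or copied.
    """
    collections = defaultdict(list)
    for path, metadata in files.items():
        for name, predicate in rules.items():
            if predicate(metadata):
                collections[name].append(path)
    return dict(collections)
```

For instance, a rule like `lambda m: m.get("Study") == "S1"` would gather every file tagged with that study, wherever it is stored.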


Data provenance

In many fields, it is helpful to know the provenance and processing pipeline used to produce derived results. Nirvana tracks data within workflows, through all transformations, analyses, and interpretations. With Nirvana, data can be shared and used with verified provenance of the conditions under which it was generated, so results are reproducible and analyzable for defects.
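
One simple way to picture provenance tracking is a record, for each derived item, of the processing step and the inputs that produced it. The Python sketch below walks such records back to the raw sources. The `lineage` function and the `derived_from` mapping are illustrative assumptions and say nothing about how Nirvana stores provenance internally.

```python
def lineage(item, derived_from):
    """Trace how `item` was produced.

    derived_from: dict mapping item -> (step name, list of input items).
    Returns (item, step, inputs) tuples, starting at `item` and walking
    back toward raw sources; items with no record are treated as raw data.
    """
    seen = set()
    trail = []
    stack = [item]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in derived_from:
            step, inputs = derived_from[current]
            trail.append((current, step, list(inputs)))
            # Reverse so inputs are visited in their listed order.
            stack.extend(reversed(inputs))
    return trail
```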


Audit

Nirvana can be used to audit every transaction on a data file within a workflow. An audit trail can be stored containing information such as the date of the transaction, the success or error code, the user performing the transaction, the type of transaction, and notes. Audit trails, like other information held by Nirvana, can be queried and filtered.
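
The audit fields listed above map naturally onto a small record type that can be filtered. The following Python sketch assumes a hypothetical `AuditEvent` layout and the convention that a status of 0 means success; neither is taken from Nirvana itself.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit record; field names are illustrative only."""
    timestamp: datetime
    user: str
    action: str     # type of transaction, e.g. "read", "write", "migrate"
    path: str
    status: int     # assumed convention: 0 = success, non-zero = error code
    note: str = ""


def filter_events(events, user=None, action=None, failed_only=False, since=None):
    """Return the events matching every filter that was supplied."""
    result = []
    for event in events:
        if user is not None and event.user != user:
            continue
        if action is not None and event.action != action:
            continue
        if failed_only and event.status == 0:
            continue
        if since is not None and event.timestamp < since:
            continue
        result.append(event)
    return result
```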


Security and access control

Nirvana can be used to control access to data in two ways: by setting up access control lists for specific users, groups, etc., using user-defined metadata attributes (Project, Study, etc.), and by setting access privilege levels, where users assigned higher levels can see more information than users assigned lower levels. Nirvana supports single sign-on and access by integrating with the Lightweight Directory Access Protocol (LDAP) and Active Directory, using challenge–response authentication, Grid Security Infrastructure (GSI), and Kerberos. Data can only be viewed and modified by users authorized to do so.
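
The two mechanisms described here, access control lists and privilege levels, can be sketched separately in Python. The `is_allowed` and `visible_fields` functions, the ACL shape, and the per-field level table are assumptions chosen for illustration, not Nirvana's actual access model.

```python
def is_allowed(user, groups, acl):
    """True when the user, or one of the user's groups, appears in the ACL.

    acl: dict with optional "users" and "groups" sets.
    groups: set of group names the user belongs to.
    """
    if user in acl.get("users", set()):
        return True
    return bool(groups & acl.get("groups", set()))


def visible_fields(record, field_levels, user_level):
    """Return only the metadata fields the user's privilege level may see.

    field_levels maps a field name to the minimum level required to view it;
    fields not listed are assumed to be visible at level 0.
    """
    return {
        name: value
        for name, value in record.items()
        if field_levels.get(name, 0) <= user_level
    }
```

In a fuller sketch the ACL would be looked up from a metadata attribute such as Project, so that every file tagged with the same project shares one list.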


File system analysis

Nirvana can be used to analyze the makeup of a shared filesystem to determine what type of data is being stored, how much space it takes up, when it was last accessed, and who stored it. With this information, storage administrators can determine the most appropriate type of storage system to use and when to move unused data to lower cost archive storage. In one example, Nirvana's analysis of data stored on an expensive enterprise NAS storage system showed that most data had not been accessed in over two years. The analysis further showed that most files were very small, and that over half the storage was consumed by just two users. Using this data, the organization replaced their enterprise storage system with less expensive object storage to better manage the many small, seldom-accessed files.
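
An analysis of this kind reduces to a few aggregates over per-file metadata. The Python sketch below computes the share of stale files, the share of small files, and the largest consumers by owner. The `analyze` function, the dictionary keys, and the thresholds (two years, one megabyte) are illustrative assumptions rather than anything Nirvana defines.

```python
from collections import Counter
from datetime import datetime, timedelta


def analyze(files, now, stale_after=timedelta(days=730), small_below=1_000_000):
    """Summarize a list of file records.

    Each record is a dict with "owner", "size" (bytes) and "last_access".
    """
    count = len(files)
    bytes_by_owner = Counter()
    for f in files:
        bytes_by_owner[f["owner"]] += f["size"]
    stale = sum(1 for f in files if now - f["last_access"] > stale_after)
    small = sum(1 for f in files if f["size"] < small_below)
    return {
        "total_bytes": sum(f["size"] for f in files),
        "stale_fraction": stale / count if count else 0.0,
        "small_fraction": small / count if count else 0.0,
        "top_owners": bytes_by_owner.most_common(2),
    }
```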

