Data provenance can be defined as a description of the origins of a piece of data and of its processing history [BKW01]. Provenance data brings transparency and helps to audit and interpret data [VLM+14]. Provenance models, such as the W3C PROV model [PROV], can be used to capture and analyze data provenance. However, PROV does not capture the specificities of a software process model, so it must be extended to capture provenance data from software processes. To enable the capture and recovery of provenance data coming from software development processes (SDP), we propose PROV-SwProcess, a standard for SDP provenance representation. PROV-SwProcess is defined as an extension of the W3C recommended standard PROV and aims to properly capture and store the most relevant information about SDP provenance.
This document specifies the PROV-SwProcess model and details its constituent parts, including examples of how to use the proposed model.
This specification is part of an in-progress doctoral thesis and is under review. It specifies a potential standard, published publicly for evaluation and possible adoption. However, it is neither associated with nor supported by any standards organization.
This document was developed as a partnership between two universities (Federal University of Rio de Janeiro and Federal University of Juiz de Fora) and was published as a proposal for a standard. If you wish to make comments regarding this document, please send them to gabriellacbc@cos.ufrj.br. All comments are welcome.
Provenance data is a record of the derivation history of data, which enables reproducibility, interpretation of results, and diagnosis of problems [LLC+10].
Buneman et al. [BKW01] define provenance data as the description of the origin of the data and of the process through which it passed. This origin can be captured considering both the data and the process.
Provenance data can be captured prospectively and retrospectively [FKS+08]. Prospective provenance captures the specification of a computational task, i.e., the steps to be followed to generate a product, such as a set of processes and/or a script. Retrospective provenance captures the steps actually performed by a computational task, as well as the environmental information used to derive a specific product, i.e., a detailed log of the task.
According to Simmhan [S07], the provenance of processes involves the description of the tasks that are part of a process. In certain cases, to obtain information about how a given product was generated, it is also important to register each data item consumed by the process. Thus, to capture process provenance, it is necessary to obtain a description of the execution of all performed tasks, including whether each task succeeded or failed during execution.
To obtain the benefits of provenance information, software process provenance data must be captured and stored for later access and analysis. One way to analyze and check the quality of data generated by software processes is through data provenance techniques and models. Two main models proposed in the literature deal with data provenance: OPM [MCF+11] and the PROV model [PROV].
Considering the existence of different systems for SDP execution and the lack of a standard model to capture the provenance of these processes, the PROV-SwProcess model was defined. In addition to capturing and storing this kind of provenance data, the model provides a structure that allows better analysis of software process provenance data later.
PROV-SwProcess was developed as an extension of the PROV model in order to capture software process provenance data. This extension was developed because PROV is more general and does not provide all the concepts related to SDP.
PROV-SwProcess uses as the basis for its definition the Software Process Execution package of the Software Process Ontology (SPO) [SPO]. This ontology establishes a common conceptualization of the software process domain and includes processes, activities, resources, people, artifacts, and procedures.
Another extension of the PROV model specification is D-PROV [MDB+13]. It aims to represent process structure, i.e., to enable storage and querying using prospective provenance, and its authors show an example of using D-PROV in the context of scientific workflows. D-PROV was a previous version of ProvONE [VLM+14]. ProvONE is a model for scientific workflow provenance that extends PROV with its specific structural elements. It was developed in the context of the DataONE Project, a large-scale, federated data infrastructure for the earth sciences community. This technical specification of PROV-SwProcess used the ProvONE standard as a basis. Although ProvONE is useful in the scientific workflow domain, it does not suffice for capturing and analyzing provenance in the SDP domain. For example, in ProvONE, workflow execution corresponds to the execution of computational tasks by software agents only, whereas in the software process context we need to express different types of agents, such as persons, software agents, and organizations.
A preliminary proposal of PROV-SwProcess (called PROV-Process) was published in [D16]. It is an initial approach to applying the PROV model in the context of software processes. PROV-SwProcess aims to incorporate the basic ideas of this work, as well as additional contributions, to derive an adequate standard that can be used in the SDP domain.
PROV-SwProcess aims to provide the fundamental information required to understand and analyze provenance data from SDP. To this end, it covers prospective and retrospective provenance [FKS+08] and the essential aspects of SDP: activities, stakeholders, resources, procedures, and artifacts, as proposed in [SPO]. Each of these aspects is described next.
Section 2 provides an overview of the PROV-SwProcess conceptual model, covering the aspects outlined in Section 1.2. The conceptual model of PROV-SwProcess is given using the Unified Modeling Language [UML].
Section 3 provides a detailed characterization of the various components of PROV-SwProcess, which is serialized as an OWL 2 ontology. It clarifies how the PROV-SwProcess concepts are related to the PROV and SPO concepts, including examples.
Section 4 describes PROV-SwProcess inferences that may be used on software process provenance data. An inference is a rule that can be applied to PROV-SwProcess instances to add new PROV-SwProcess statements.
The References section gives the full references to additional resources used to define the PROV-SwProcess standard.
The following namespaces and prefixes are used throughout this document.
prefix | namespace IRI | definition |
prov | http://www.w3.org/ns/prov# | The PROV namespace [PROVO] |
provswprocess | http://purl.org/provswprocess# | The PROV-SwProcess namespace [PROV-SwProcess] |
xsd | http://www.w3.org/2001/XMLSchema# | XML Schema namespace [XMLSCHEMA11-2] |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | The RDF namespace [RDF-CONCEPTS] |
rdfs | http://www.w3.org/2000/01/rdf-schema# | The RDFS namespace [RDF-SCHEMA] |
owl | http://www.w3.org/2002/07/owl# | OWL 2 specification namespace [OWL2] |
: | http://example.com/ | Artificial namespace for examples |
This section briefly describes PROV-SwProcess using diagrams to represent its conceptual model (Figures 1, 2, and 3).
The following points should be considered regarding the PROV-SwProcess conceptual model:
Figure 1 presents the PROV-SwProcess model constructs for its Retrospective Provenance part. In addition, when more than one SDP instance is to be analyzed, the relation WasComposedBy can also be inferred, as shown in Figure 2, making it possible to obtain all the stakeholders, resources, artifacts, and procedures involved in a specific executed SDP instance.
Figure 3 presents the PROV-SwProcess model constructs for its Prospective Provenance part.
All the PROV-SwProcess constructs are summarized in Table 2. The first column lists the aspects covered by PROV-SwProcess, serving to indicate the various constructs associated with each aspect. The second and third columns indicate the type of each construct as presented in the UML class diagram (class or association) and the construct name, respectively. The last column contains a link to each construct specification in Section 3.
This section presents the specification of the PROV-SwProcess constructs, as presented in Figures 1, 2, 3, and Table 2. The specification uses an OWL 2 [OWL2] ontology that extends the W3C PROV-O ontology [PROVO].
The namespace for all PROV-SwProcess terms is http://purl.org/provswprocess# (the prefix declared in the examples and used by every term IRI below).
A Software Process represents a software development process in its entirety.
Considering its retrospective provenance, a Software Process was composed by executed activities (provswprocess:wasComposedBy) and attributed to a responsible stakeholder, using the association prov:wasAttributedTo.
When considering its prospective provenance, a Software Process is composed by activities (using the prospective relation provswprocess:isComposedBy) and may have a responsible stakeholder (provswprocess:hasResponsible).
IRI: http://purl.org/provswprocess#Software_Process
The following example shows a Software_Process identified as New_Resource_Development. It was composed by the activities New_Resource_Specification, Codification, Test_Cases_Definition, Test, and Deploy (line 10), and was assigned to the Stakeholder Simon, using the relation wasAttributedTo (line 11). Lines 13 and 14 establish the prospective provenance of this Software_Process: it is composed by the same five activities listed in its retrospective provenance (New_Resource_Specification, Codification, Test_Cases_Definition, Test, and Deploy), and the stakeholder Simon is responsible for the process. In this example, there were no differences between what was planned and what was actually executed.
Example 1
 1  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
 2  @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
 3  @prefix owl: <http://www.w3.org/2002/07/owl#> .
 4  @prefix prov: <http://www.w3.org/ns/prov#> .
 5  @prefix provswprocess: <http://purl.org/provswprocess#> .
 6  @prefix : <http://example.com/> .
 7
 8  :New_Resource_Development
 9      a owl:NamedIndividual , provswprocess:Software_Process ;
10      provswprocess:wasComposedBy :New_Resource_Specification , :Codification , :Test_Cases_Definition , :Test , :Deploy ;
11      prov:wasAttributedTo :Simon ;
12      # prospective provenance
13      provswprocess:isComposedBy :New_Resource_Specification , :Codification , :Test_Cases_Definition , :Test , :Deploy ;
14      provswprocess:hasResponsible :Simon
15  .
16  :Simon a owl:NamedIndividual , provswprocess:Person_Stakeholder .
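The comparison between what was planned and what was executed, mentioned for Example 1, can be sketched in a few lines of Python. This is only an illustration and not part of the PROV-SwProcess specification: the statements are re-typed by hand as (subject, predicate, object) tuples, and the helper names `objects_of` and `compare_plan_to_execution` are hypothetical.

```python
# Activities of Example 1, listed once because the planned and executed sets coincide.
ACTIVITIES = [
    "New_Resource_Specification", "Codification",
    "Test_Cases_Definition", "Test", "Deploy",
]
PROCESS = "New_Resource_Development"

# Statements of Example 1 as (subject, predicate, object) tuples (hand-copied).
triples = (
    [(PROCESS, "provswprocess:wasComposedBy", a) for a in ACTIVITIES]
    + [(PROCESS, "provswprocess:isComposedBy", a) for a in ACTIVITIES]
)


def objects_of(triples, subject, predicate):
    """Return the set of objects stated for the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def compare_plan_to_execution(triples, process):
    """Compare prospective (isComposedBy) and retrospective (wasComposedBy) activities."""
    planned = objects_of(triples, process, "provswprocess:isComposedBy")
    executed = objects_of(triples, process, "provswprocess:wasComposedBy")
    return {
        "planned_not_executed": planned - executed,
        "executed_not_planned": executed - planned,
    }


report = compare_plan_to_execution(triples, PROCESS)
print(report)
```

For Example 1 both sets are empty. Removing, say, the wasComposedBy statement for Deploy would list Deploy under `planned_not_executed`.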
An Activity (prov:Activity) is adopted in PROV-SwProcess to represent a computational task in the software development process. It can be atomic or composite and may include the adoption of procedures, the use of resources, the modification, use, and generation of artifacts, and the association with the stakeholders responsible for its execution.
IRI: http://www.w3.org/ns/prov#Activity
The following fragment specifies an Activity identified as Codification and all its associations. For example, line 5 shows that this activity changed Payment_Component (a Software_Item, as specified in line 25).
Example 2
 1  :Codification
 2      a owl:NamedIndividual , prov:Activity ;
 3      prov:used :Eclipse_IDE , :Financial_Module , :Requirements_Document , :UML_class_model ;
 4      prov:wasAssociatedWith :Derek , :Simon ;
 5      provswprocess:changed :Payment_Component ;
 6      prov:endedAtTime "2017-01-15T18:00:00Z"^^xsd:dateTime ;
 7      prov:startedAtTime "2017-01-15T13:00:00Z"^^xsd:dateTime ;
 8      # prospective provenance
 9      provswprocess:isAssociatedWith :Programmer ;
10      provswprocess:precedes :Test ;
11      provswprocess:uses :Eclipse_IDE
12  .
13  :Eclipse_IDE a owl:NamedIndividual , provswprocess:Software_Product .
14  :Financial_Module a owl:NamedIndividual , provswprocess:Software_Item .
15  :Requirements_Document a owl:NamedIndividual , provswprocess:Document .
16  :UML_class_model a owl:NamedIndividual , provswprocess:Model .
17  :Derek
18      a owl:NamedIndividual , provswprocess:Person_Stakeholder ;
19      prov:actedOnBehalfOf :Simon ;
20      # prospective provenance
21      provswprocess:actsOnBehalfOf :Simon ;
22      provswprocess:hasRole :Programmer
23  .
24  :Simon a owl:NamedIndividual , provswprocess:Person_Stakeholder .
25  :Payment_Component a owl:NamedIndividual , provswprocess:Software_Item .
26  :Programmer a owl:NamedIndividual , provswprocess:Stakeholder_Role .
The relation Adopted represents the adoption of some procedure (a method, a document template, or a technique) by an activity, considering the SDP retrospective provenance. This relation is not specified in PROV; it was created based on SPO.
IRI: http://purl.org/provswprocess#adopted
The following example shows, in line 3, the adoption of Test_Cases_Template (a Document_Template) by the activity Test_Cases_Definition.
Example 3
 1  :Test_Cases_Definition
 2      a owl:NamedIndividual , prov:Activity ;
 3      provswprocess:adopted :Test_Cases_Template ;
 4      prov:generated :Payment_Test_Cases ;
 5      prov:used :Requirements_Document ;
 6      prov:wasAssociatedWith :Mary ;
 7      prov:endedAtTime "2017-01-18T13:00:00Z"^^xsd:dateTime ;
 8      prov:startedAtTime "2017-01-18T10:00:00Z"^^xsd:dateTime ;
 9      # prospective provenance
10      provswprocess:adopts :Test_Cases_Template ;
11      provswprocess:isAssociatedWith :Tester ;
12      provswprocess:isSubActivity :Test
13  .
14
15  :Test_Cases_Template a owl:NamedIndividual , provswprocess:Document_Template .
16  :Payment_Test_Cases a owl:NamedIndividual , provswprocess:Document .
17  :Requirements_Document a owl:NamedIndividual , provswprocess:Document .
18  :Mary a owl:NamedIndividual , provswprocess:Person_Stakeholder .
19  :Tester a owl:NamedIndividual , provswprocess:Stakeholder_Role .
Considering that an artifact can be created, changed, or used, the relation Changed represents some modification of an artifact by an activity, considering the SDP retrospective provenance. This relation is not specified in PROV; it was created in PROV-SwProcess based on SPO.
IRI: http://purl.org/provswprocess#changed
Example 2 shows the activity Codification, which changed a software item artifact called Payment_Component, as can be seen in lines 5 and 25.
The relation Generated (prov:generated) was derived from PROV and, in the context of SDP, represents the creation or production of a new Artifact by an Activity, considering the SDP retrospective provenance.
IRI: http://www.w3.org/ns/prov#generated
Example 3 shows, in line 4, the activity Test_Cases_Definition generating the document artifact Payment_Test_Cases.
The relation Used, in the context of SDP, represents the usage (without any change) of an artifact or a resource by an activity. It was derived from the PROV model to represent SDP retrospective provenance.
IRI: http://www.w3.org/ns/prov#used
Example 2 specifies, in line 3, the usage of four artifacts by the activity Codification: (i) Eclipse_IDE, a Software_Product; (ii) Financial_Module, a Software_Item; (iii) Requirements_Document, a Document; and (iv) UML_class_model, a Model.
The relation WasAssociatedWith is adopted in the PROV-SwProcess model to state that a Stakeholder participated in an Activity of the SDP. It was derived from the PROV model to represent SDP retrospective provenance.
IRI: http://www.w3.org/ns/prov#wasAssociatedWith
Example 3 specifies, in line 6, that Mary, a Person_Stakeholder, participated in the Test_Cases_Definition activity.
The relation WasAttributedTo was derived from the PROV model, but in the context of SDP it refers to the ascribing of a Software_Process to a Stakeholder, when the Stakeholder was responsible for the SDP. This relation is part of SDP retrospective provenance.
IRI: http://www.w3.org/ns/prov#wasAttributedTo
Example 1 shows a Software_Process identified as New_Resource_Development that was assigned to the Stakeholder Simon, using the relation wasAttributedTo (line 11).
When only one SDP instance is considered, the relation WasComposedBy is used to specify all the activities that actually composed the execution of this instance.
When more than one SDP instance is to be analyzed, the relation WasComposedBy is inferred, making it possible to obtain all the Stakeholders, Resources, Artifacts, and Procedures involved in a specific SDP instance.
This relation is not specified in the PROV model. In the PROV-SwProcess model, it is part of SDP retrospective provenance.
IRI: http://purl.org/provswprocess#wasComposedBy
The relation WasInformedBy was derived from PROV and represents the dependency between two activities. It implies an exchange of some artifact between two activities, with one activity using (or changing) some artifact generated by the other. This relation is part of SDP retrospective provenance and is inferred by the PROV-SwProcess model (for more details, see Section 4). It is useful only when just one instance is analyzed.
IRI: http://www.w3.org/ns/prov#wasInformedBy
The relation WasInformedBy can be inferred, for example, for the activity Codification, stating that it was informed by the activity New_Resource_Specification, since Requirements_Document was generated by the activity New_Resource_Specification (line 4) and was used by the activity Codification (line 16).
Example 4
 1  :New_Resource_Specification
 2      a owl:NamedIndividual , prov:Activity ;
 3      provswprocess:adopted :Software_Cost_Reduction ;
 4      prov:generated :Requirements_Document ;
 5      prov:used :Client_Request_Email ;
 6      prov:wasAssociatedWith :Client , :Joao , :Support_Team ;
 7      prov:endedAtTime "2017-01-15T12:00:00Z"^^xsd:dateTime ;
 8      prov:startedAtTime "2017-01-14T10:00:00Z"^^xsd:dateTime ;
 9      # prospective provenance
10      provswprocess:generates :Requirements_Document ;
11      provswprocess:precedes :Codification
12  .
13
14  :Codification
15      a owl:NamedIndividual , prov:Activity ;
16      prov:used :Eclipse_IDE , :Financial_Module , :Requirements_Document , :UML_class_model ;
17      prov:wasAssociatedWith :Derek , :Simon ;
18      provswprocess:changed :Payment_Component ;
19      prov:endedAtTime "2017-01-15T18:00:00Z"^^xsd:dateTime ;
20      prov:startedAtTime "2017-01-15T13:00:00Z"^^xsd:dateTime ;
21      # prospective provenance
22      provswprocess:isAssociatedWith :Programmer ;
23      provswprocess:precedes :Test ;
24      provswprocess:uses :Eclipse_IDE
25  .
26
27  :Software_Cost_Reduction a owl:NamedIndividual , provswprocess:Method .
28  :Requirements_Document a owl:NamedIndividual , provswprocess:Document .
29  :Client_Request_Email a owl:NamedIndividual , provswprocess:Information_Item .
30  :Client a owl:NamedIndividual , provswprocess:Organization_Stakeholder .
31  :Joao a owl:NamedIndividual , provswprocess:Person_Stakeholder .
32  :Support_Team a owl:NamedIndividual , provswprocess:Team_Stakeholder .
33  :Eclipse_IDE a owl:NamedIndividual , provswprocess:Software_Product .
34  :Financial_Module a owl:NamedIndividual , provswprocess:Software_Item .
35  :Requirements_Document a owl:NamedIndividual , provswprocess:Document .
36  :UML_class_model a owl:NamedIndividual , provswprocess:Model .
37  :Derek
38      a owl:NamedIndividual , provswprocess:Person_Stakeholder ;
39      prov:actedOnBehalfOf :Simon ;
40      # prospective provenance
41      provswprocess:actsOnBehalfOf :Simon ;
42      provswprocess:hasRole :Programmer
43  .
44  :Simon a owl:NamedIndividual , provswprocess:Person_Stakeholder .
45  :Payment_Component a owl:NamedIndividual , provswprocess:Software_Item .
46  :Programmer a owl:NamedIndividual , provswprocess:Stakeholder_Role .
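The WasInformedBy dependency described above (one activity using or changing an artifact generated by another) can be sketched in Python as follows. This is a minimal illustration and not the normative rule, which is given in Section 4: the retrospective statements of Example 4 are re-typed by hand as tuples, and the function name `infer_was_informed_by` is hypothetical.

```python
# Retrospective statements hand-copied from Example 4 as (subject, predicate, object).
triples = [
    ("New_Resource_Specification", "prov:generated", "Requirements_Document"),
    ("New_Resource_Specification", "prov:used", "Client_Request_Email"),
    ("Codification", "prov:used", "Eclipse_IDE"),
    ("Codification", "prov:used", "Financial_Module"),
    ("Codification", "prov:used", "Requirements_Document"),
    ("Codification", "prov:used", "UML_class_model"),
    ("Codification", "provswprocess:changed", "Payment_Component"),
]


def infer_was_informed_by(triples):
    """Derive (consumer, prov:wasInformedBy, producer) statements.

    A consumer activity is informed by a producer activity when it used or
    changed an artifact that the producer generated.
    """
    # Index: artifact -> activities that generated it.
    generators = {}
    for s, p, o in triples:
        if p == "prov:generated":
            generators.setdefault(o, set()).add(s)

    inferred = set()
    for s, p, o in triples:
        if p in ("prov:used", "provswprocess:changed"):
            for producer in generators.get(o, ()):
                if producer != s:  # an activity does not inform itself
                    inferred.add((s, "prov:wasInformedBy", producer))
    return inferred


inferred = infer_was_informed_by(triples)
print(inferred)
```

With the statements of Example 4, the only derived statement is that Codification was informed by New_Resource_Specification, through Requirements_Document.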
The relation Adopts represents the need for an activity to adopt some Procedure (a method, a document template, or a technique), considering the SDP prospective provenance. This relation is not specified in PROV, since that model does not address SDP prospective provenance.
IRI: http://purl.org/provswprocess#adopts
Example 3 specifies, in line 10, that Test_Cases_Template (a Document_Template) should be adopted when the activity Test_Cases_Definition is executed.
The relation Changes represents that an activity has to change (modify) an artifact, considering the SDP prospective provenance. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#changes
The following fragment specifies the activity Deploy. Line 8 contains the relation changes, specifying that this activity has to change the Accounting_System, a Software_Product (a specific type of Artifact).
Example 5
 1  :Deploy
 2      a owl:NamedIndividual , prov:Activity ;
 3      provswprocess:changed :Accounting_System ;
 4      prov:wasAssociatedWith :Simon ;
 5      prov:endedAtTime "2017-01-21T18:00:00Z"^^xsd:dateTime ;
 6      prov:startedAtTime "2017-01-20T10:00:00Z"^^xsd:dateTime ;
 7      # prospective provenance
 8      provswprocess:changes :Accounting_System
 9  .
10  :Accounting_System a owl:NamedIndividual , provswprocess:Software_Product .
11  :Simon a owl:NamedIndividual , provswprocess:Person_Stakeholder .
The relation Generates represents that an activity will generate an artifact, considering the SDP prospective provenance. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#generates
Example 4 shows, in line 10, that the activity New_Resource_Specification generates the document artifact Requirements_Document.
The relation HasResponsible should be used to express who is responsible for a Software_Process, considering the SDP prospective provenance. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#hasResponsible
Example 1 shows, in line 14, that Simon, a Person_Stakeholder, was defined as responsible for the Software_Process called New_Resource_Development.
The relation IsAssociatedWith should be adopted to assign a Stakeholder_Role to an Activity, considering the SDP prospective provenance. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#isAssociatedWith
Example 2 shows, in line 9, that the activity Codification is associated with a stakeholder role called Programmer.
The relation IsComposedBy is part of the SDP prospective provenance and indicates all the activities that compose an SDP as a whole. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#isComposedBy
Example 1 shows, in line 13, all the activities that compose the New_Resource_Development Software_Process.
The relation IsSubActivity indicates that an activity is defined as a sub-activity of another activity, considering the SDP prospective provenance. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#isSubActivity
Example 3 shows, in line 12, that Test_Cases_Definition is a sub-activity of the Test activity.
The relation Precedes was specified in PROV-SwProcess to represent part of the SDP prospective provenance and indicates that one activity precedes another in the process activity flow. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#precedes
Example 4 shows, in line 11, that New_Resource_Specification precedes the activity Codification and, in line 23, that Codification precedes the activity Test.
The relation Uses represents that an activity has to use an artifact or a resource, considering the SDP prospective provenance. This relation is not specified in PROV.
IRI: http://purl.org/provswprocess#uses
Example 2 shows, in line 11, that Codification has to use Eclipse_IDE (a Software_Product artifact).
A Stakeholder in PROV-SwProcess, as in SPO, represents an agent involved in, interested in, or affected by the software process activities. It can be specialized into three types: (i) Organization Stakeholder, (ii) Person Stakeholder, and (iii) Team Stakeholder.
IRI: http://purl.org/provswprocess#Stakeholder
An Organization Stakeholder represents organizations involved in, interested in, or affected by the activities or results of software development processes. Examples of Organization Stakeholders are a Project Client or a Quality Assessment Organization. This class does not exist in PROV and was derived from SPO.
IRI: http://purl.org/provswprocess#Organization_Stakeholder
Example 4 shows, in line 30, an Organization_Stakeholder called Client.
A Person Stakeholder represents persons involved in, interested in, or affected by the activities or results of software development processes. Examples of Person Stakeholders are a hired Programmer, an external Instructor, or a User. This class does not exist in PROV and was derived from SPO.
IRI: http://purl.org/provswprocess#Person_Stakeholder
In Example 5, there is a Person_Stakeholder called Simon in line 11.
A Team Stakeholder represents teams involved in, interested in, or affected by the activities or results of software development processes. Examples of Team Stakeholders are the Software Engineering Process Group, a Quality Assurance Team, or a Testing Team. This class does not exist in PROV and was derived from SPO.
IRI: http://purl.org/provswprocess#Team_Stakeholder
Example 4 shows, in line 32, a Team_Stakeholder called Support_Team.
A Stakeholder Role is related to the function performed (or that should be performed) by a stakeholder when an activity is associated with that stakeholder.
IRI: http://purl.org/provswprocess#Stakeholder_Role
Example 2 shows, in line 26, a Programmer, an example of a Stakeholder_Role.
The relation ActedOnBehalfOf was derived from PROV and, in the context of SDP, represents the assignment of authority and responsibility to a stakeholder (by itself or by another stakeholder) to carry out a specific activity.
IRI: http://www.w3.org/ns/prov#actedOnBehalfOf
Example 2 shows, in line 19, that Derek acted on behalf of Simon.
The relation Created represents the creation of a new artifact by a stakeholder. This relation is part of SDP retrospective provenance and is inferred by the PROV-SwProcess model (for more details, see Section 4).
IRI: http://purl.org/provswprocess#created
The relation Modified represents the alteration of some existing artifact by a stakeholder. This relation is part of SDP retrospective provenance and is inferred by the PROV-SwProcess model (for more details, see Section 4).
IRI: http://purl.org/provswprocess#modified
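The inference rules for Created and Modified are defined in Section 4 and are not reproduced here. The Python sketch below only illustrates one plausible reading, which is an assumption of this example: a stakeholder associated with an activity is taken to have created every artifact that the activity generated and to have modified every artifact that it changed. The statements are re-typed by hand from Examples 2 and 3, and the function name `infer_created_and_modified` is hypothetical.

```python
# Retrospective statements hand-copied from Examples 2 and 3.
triples = [
    ("Test_Cases_Definition", "prov:generated", "Payment_Test_Cases"),
    ("Test_Cases_Definition", "prov:wasAssociatedWith", "Mary"),
    ("Codification", "provswprocess:changed", "Payment_Component"),
    ("Codification", "prov:wasAssociatedWith", "Derek"),
    ("Codification", "prov:wasAssociatedWith", "Simon"),
]

# ASSUMED mapping from an activity-level relation to the stakeholder-level one.
ACTIVITY_TO_STAKEHOLDER_RELATION = {
    "prov:generated": "provswprocess:created",
    "provswprocess:changed": "provswprocess:modified",
}


def infer_created_and_modified(triples):
    """Derive (stakeholder, created|modified, artifact) under the assumed rule."""
    # Index: activity -> stakeholders that participated in it.
    stakeholders = {}
    for s, p, o in triples:
        if p == "prov:wasAssociatedWith":
            stakeholders.setdefault(s, set()).add(o)

    inferred = set()
    for activity, p, artifact in triples:
        relation = ACTIVITY_TO_STAKEHOLDER_RELATION.get(p)
        if relation is None:
            continue  # not a generation or change statement
        for stakeholder in stakeholders.get(activity, ()):
            inferred.add((stakeholder, relation, artifact))
    return inferred


inferred = infer_created_and_modified(triples)
for statement in sorted(inferred):
    print(statement)
```

Under this assumed rule, Mary created Payment_Test_Cases, and both Derek and Simon modified Payment_Component.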
The relation ActsOnBehalfOf represents the assignment of authority and responsibility from a stakeholder to itself or to another stakeholder during the process specification, i.e., considering SDP prospective provenance.
IRI: http://purl.org/provswprocess#actsOnBehalfOf
Example 2 shows, in line 21, that the stakeholder Simon has authority (or responsibility) over the stakeholder Derek in the analyzed SDP.
The relation HasRole specifies the role(s) that a stakeholder can perform in the process activities, considering SDP prospective provenance.
IRI: http://purl.org/provswprocess#hasRole
Example 2 shows, in line 22, that the stakeholder Derek can act as a Programmer in the analyzed SDP.
A Resource in PROV-SwProcess, as in SPO, represents the different types of resources used by the SDP activities. It can be specialized into two types: (i) Software Resource and (ii) Hardware Resource.
IRI: http://purl.org/provswprocess#Resource
The following fragment specifies the activity Test. This activity uses a Hardware_Resource called Dell_Inspiron_Intel_Core_i7_8GB_1TB (as can be seen in lines 4 and 13) and a Software_Resource called JUnit5 (as can be seen in lines 4 and 14).
Example 6
 1  :Test
 2      a owl:NamedIndividual , prov:Activity ;
 3      provswprocess:adopted :white-box_testing ;
 4      prov:used :Dell_Inspiron_Intel_Core_i7_8GB_1TB , :JUnit5 , :Payment_Component , :Payment_Test_Cases ;
 5      prov:wasAssociatedWith :Mary ;
 6      prov:endedAtTime "2017-01-18T18:00:00Z"^^xsd:dateTime ;
 7      prov:startedAtTime "2017-01-18T10:00:00Z"^^xsd:dateTime ;
 8      # prospective provenance
 9      provswprocess:isAssociatedWith :Tester ;
10      provswprocess:precedes :Deploy
11  .
12  :white-box_testing a owl:NamedIndividual , provswprocess:Technique .
13  :Dell_Inspiron_Intel_Core_i7_8GB_1TB a owl:NamedIndividual , provswprocess:Hardware_Resource .
14  :JUnit5 a owl:NamedIndividual , provswprocess:Software_Resource .
15  :Payment_Component a owl:NamedIndividual , provswprocess:Software_Item .
16  :Payment_Test_Cases a owl:NamedIndividual , provswprocess:Document .
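Because prov:used covers both resources and artifacts, a consumer of Example 6 may want to separate the two. The Python sketch below does so from the declared types. It is an illustration only: the declarations are re-typed by hand, the helper name `split_used` is hypothetical, and the two class sets follow the Resource and Artifact subtypes named in this specification.

```python
# Subtypes of Resource and Artifact as listed in this specification.
RESOURCE_CLASSES = {"Software_Resource", "Hardware_Resource"}
ARTIFACT_CLASSES = {
    "Software_Product", "Software_Item", "Document", "Model", "Information_Item",
}

# Declared type of each individual used by the activity Test (Example 6, lines 13-16).
types = {
    "Dell_Inspiron_Intel_Core_i7_8GB_1TB": "Hardware_Resource",
    "JUnit5": "Software_Resource",
    "Payment_Component": "Software_Item",
    "Payment_Test_Cases": "Document",
}
# Objects of prov:used in line 4 of Example 6, in the order they are listed.
used_by_test = [
    "Dell_Inspiron_Intel_Core_i7_8GB_1TB", "JUnit5",
    "Payment_Component", "Payment_Test_Cases",
]


def split_used(used, types):
    """Partition used individuals into resources, artifacts, and untyped ones."""
    resources, artifacts, unknown = [], [], []
    for name in used:
        declared = types.get(name)
        if declared in RESOURCE_CLASSES:
            resources.append(name)
        elif declared in ARTIFACT_CLASSES:
            artifacts.append(name)
        else:
            unknown.append(name)  # no type declaration, or an unrelated class
    return resources, artifacts, unknown


resources, artifacts, unknown = split_used(used_by_test, types)
print("resources:", resources)
print("artifacts:", artifacts)
```

For the Test activity, the hardware and JUnit5 end up as resources, while Payment_Component and Payment_Test_Cases are artifacts.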
A Software Resource ?, as in SPO, is used in PROV-SwProcess to represent the participation of a software product as Resource in a performed activity. An example of a Software Resource is the use of the MS Project in a Project Scheduling activity.
IRI:http://purl.org/provswprocess#Software_Resource
has super-class Example 6 ? shows in line 4 that the activity Test uses JUnit5, a Software_Resource (as specified in line 14).A Hardware Resource ?, as in SPO, is used in PROV-SwProcess to represent a hardware equipment used as resource of some process activity. Examples of Hardware Resources are a Laser Printer or the use of a specific computer by the Project Planning activity.
IRI:http://purl.org/provswprocess#Hardware_Resource
has super-class Example 6 ? shows in line 4 that the activity Test uses Dell_Inspiron_Intel_Core_i7_8GB_1TB, a Hardware_Resource (as specified in line 13).A Procedure ? in PROV-SwProcess, as in SPO, is a normative description prescribing a defined way for performing the software development process activities. As examples of Procedures we can cite a Programming Technique adopted by a coding activity or a OO Method adopted by a conceptual modeling activity. Procedures can be of three types: (i) Method, (ii) Document Template, and (iii) Technique.
IRI:http://purl.org/provswprocess#Procedure
is in domain ofA Method ? is used in PROV-SwProcess to represent a systematic procedure that defines how one or more activities were performed.
IRI:http://purl.org/provswprocess#Method
has super-class Example 4 ? shows in line 27 a Method called Software_Cost_Reduction, that was adopted by the activity New_Resource_Specification.A Document Template ?, as in SPO, is used in PROV-SwProcess to represent a procedure that stablishes "a uniform way for preparing a Document, providing a predefined format and a defined structure for filling it with the required information". As examples of Document Templates we can cite a Project Plan Template, and a Test Report Template.
IRI:http://purl.org/provswprocess#Document_Template
has super-class is in domain ofA Technique?, as in SPO, is used in PROV-SwProcess to represent "a procedure that provides heuristics to perform an activity". Examples of Techniques are pair programming, white-box testing, and brainstorming.
IRI:http://purl.org/provswprocess#Technique
has super-class In example 6 ? there is a Technique called white-box_testing (line 12), that was adopted by the Test activity.The relation WasAppliedTo ? specifies the use of a specific template for creating one or more documents. This relation is part of SDP retrospective provenance and is inferred by PROV-SwProcess model (for more details about it see Section 4).
IRI:http://purl.org/provswprocess#wasAppliedTo
has domain has rangeAn Artifact ? in PROV-SwProcess represent the objects produced, changed, or used in the software development process activities. Artifacts can be of five types: (i) Software_Product, (ii) Software_Item, (iii) Document, (iv) Model, and (v) Information_Item.
IRI:http://purl.org/provswprocess#Artifact
is in domain ofA Software Product ? in PROV-SwProcess, as in SPO, is an artifact representing computer programs ready to be used for supporting software development process activities. As examples of Software Products, we can cite development tools, text editors, and libraries.
IRI:http://purl.org/provswprocess#Software_Product
has super-class
Example 5 shows in line 10 a Software_Product called Accounting_System. It was changed by the activity Deploy, as shown in line 3.
A Software Item is used in PROV-SwProcess to represent any software item that cannot be classified as a complete software product. Examples of Software Items are a database schema or a specific system component or module.
IRI:http://purl.org/provswprocess#Software_Item
has super-class
Example 2 shows the activity Codification using a Software Item called Financial_Module (lines 3 and 14).
A Document is another possible Artifact type. Examples of Document Artifacts are a requirements specification or a document describing the software architecture.
IRI:http://purl.org/provswprocess#Document
has super-class
is in range of
A Model, as in SPO, is used in PROV-SwProcess to represent an abstraction of a process or system from a particular perspective. Examples of Models are a use case model, a class model, and a component model.
IRI:http://purl.org/provswprocess#Model
has super-class
Example 4 shows the activity Codification using a Model called UML_class_model (lines 16 and 36).
An Information Item, as in SPO, is used in PROV-SwProcess to represent any artifact with relevant information for human use. Examples of Information Items are a reported bug, an agreement e-mail, and a component description.
IRI:http://purl.org/provswprocess#Information_Item
has super-class
Example 4 shows in line 5 the usage of an Information_Item called Client_Request_Email by the activity New_Resource_Specification.
The relation WasBasedOn specifies that an artifact was based on some procedure during its creation or modification. This relation is part of SDP retrospective provenance and is inferred by the PROV-SwProcess model (see Section 4 for details).
IRI:http://purl.org/provswprocess#wasBasedOn
has domain
has range
The relation WasDerivedFrom was introduced in the PROV model; in the context of SDP, however, it should be used to express the versioning of Artifacts. This relation is part of SDP retrospective provenance and is inferred by the PROV-SwProcess model (see Section 4 for details).
IRI:http://www.w3.org/ns/prov#wasDerivedFrom
has domain
has range
The startedAtTime Data Property is derived from PROV and used in PROV-SwProcess to specify the date and time when an activity started.
IRI:http://www.w3.org/ns/prov#startedAtTime
has domain
has range
The endedAtTime Data Property is derived from PROV and used in PROV-SwProcess to specify the date and time when an activity ended.
IRI:http://www.w3.org/ns/prov#endedAtTime
has domain
has range
The generatedAtTime Data Property is derived from PROV and used in PROV-SwProcess to specify the date and time when an Artifact or a Procedure was generated.
IRI:http://www.w3.org/ns/prov#generatedAtTime
has domain
has range
The invalidatedAtTime Data Property is derived from PROV and used in PROV-SwProcess to specify the date and time when an Artifact or a Procedure was invalidated.
IRI:http://www.w3.org/ns/prov#invalidatedAtTime
has domain
has range
The PROV model presents some constraints in its documentation [PROV-Constraints]. They define when a PROV instance is valid, ensuring that the instance represents a consistent history of objects and their interactions that is safe to use for logical reasoning or for other types of analysis. Part of the PROV-Constraints document describes the inferences that may be applied to provenance data.
Considering an inference as a rule that can be applied to PROV-SwProcess instances to add new PROV-SwProcess statements, the PROV-SwProcess model also specifies its own inference rules. Besides being specified here, these rules are implemented in the PROV-SwProcess Ontology.
Fifteen inference rules have been defined specifically for the SDP domain and specified using the Semantic Web Rule Language (SWRL) [HPB+04]. They can be divided into 7 groups (detailed in the following):
All the proposed inferences have the form:
That means:
After each inference definition, an example of its operation is presented, together with how it was implemented in the ontology using SWRL.
Inference 1 states that if an activity ac was associated with a stakeholder sta and this activity ac generated an artifact art, the relation created between the stakeholder sta and the artifact art can be inferred. To express this inference, a restriction using a SWRL rule was created in the ontology:
Figure 4 shows part of the example used to explain the PROV-SwProcess model with its possible inferences. Even though there is no explicit and direct relation in the provenance data between Mary and Payment_Test_Cases, we can infer, using the rule presented by Inference 1, that Mary created Payment_Test_Cases.
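As a rough illustration only, the logic of Inference 1 can be sketched in plain Python over (subject, predicate, object) triples. This is not the SWRL rule implemented in the ontology. The predicate spellings wasAssociatedWith and wasGeneratedBy are borrowed from PROV-O and are assumed here, and the activity name Test_Case_Design is hypothetical.

```python
# Provenance statements as (subject, predicate, object) triples.
# Assumptions: the predicate spellings "wasAssociatedWith" and
# "wasGeneratedBy", and the hypothetical activity Test_Case_Design.
statements = {
    ("Test_Case_Design", "wasAssociatedWith", "Mary"),
    ("Payment_Test_Cases", "wasGeneratedBy", "Test_Case_Design"),
}


def infer_created(triples):
    """Infer (sta, 'created', art) when ac wasAssociatedWith sta
    and art wasGeneratedBy the same activity ac."""
    inferred = set()
    for ac, p1, sta in triples:
        if p1 != "wasAssociatedWith":
            continue
        for art, p2, generator in triples:
            if p2 == "wasGeneratedBy" and generator == ac:
                inferred.add((sta, "created", art))
    return inferred


print(infer_created(statements))
```

The function only returns the new statements, so the explicit provenance data and the inferred relations stay separate.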
Inference 2 states that if an activity ac was associated with a stakeholder sta and during this activity ac an artifact art was changed, the relation modified between the stakeholder sta and the artifact art can be inferred. To express this inference, a restriction using a SWRL rule was created in the ontology:
Figure 4 shows that even though there is no explicit and direct relation in the provenance data between Simon and Accounting_System, we can infer, using the rule presented by Inference 2, that Simon modified the Accounting_System.
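A minimal sketch of the same reasoning in plain Python is given next, as an illustration rather than the ontology's SWRL rule. The predicate spellings wasAssociatedWith and changed are assumptions, as is the association between Simon and the activity Deploy; the second activity is included only to show that statements about different activities are not combined.

```python
from collections import defaultdict

# (subject, predicate, object) triples. Assumptions: the predicate
# spellings, the link between Simon and Deploy, and the hypothetical
# association between Mary and the Test activity.
statements = [
    ("Deploy", "wasAssociatedWith", "Simon"),
    ("Deploy", "changed", "Accounting_System"),
    ("Test", "wasAssociatedWith", "Mary"),
]


def infer_modified(triples):
    """Infer (sta, 'modified', art) when the same activity was associated
    with sta and changed art."""
    stakeholders = defaultdict(set)  # activity -> stakeholders
    changed = defaultdict(set)       # activity -> changed artifacts
    for s, p, o in triples:
        if p == "wasAssociatedWith":
            stakeholders[s].add(o)
        elif p == "changed":
            changed[s].add(o)
    return {
        (sta, "modified", art)
        for ac in stakeholders.keys() & changed.keys()
        for sta in stakeholders[ac]
        for art in changed[ac]
    }


print(infer_modified(statements))
```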
Inference 3 states that if an activity ac adopted a procedure pro and this same activity ac generated or changed an artifact art, the relation wasBasedOn can be inferred between the artifact art and the procedure pro. To express this inference, two specific restrictions using SWRL rules were created in the ontology:
Figure 4 shows that even though there is no explicit and direct relation in the provenance data between Payment_Test_Cases and Test_Cases_Template, we can infer, using the rules presented by Inference 3, that Payment_Test_Cases wasBasedOn Test_Cases_Template. Another inference of this same type can be seen between Requirements_Document and Software_Cost_Reduction.
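The two cases of this inference (artifact generated, artifact changed) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the SWRL rules of the ontology: the predicate spellings adopted, wasGeneratedBy, and changed are assumed, Test_Case_Design is a hypothetical activity name, and which activity generated or changed each artifact is assumed for the sake of the example.

```python
# (subject, predicate, object) triples; see the assumptions stated above.
statements = {
    ("Test_Case_Design", "adopted", "Test_Cases_Template"),
    ("Payment_Test_Cases", "wasGeneratedBy", "Test_Case_Design"),
    ("New_Resource_Specification", "adopted", "Software_Cost_Reduction"),
    ("New_Resource_Specification", "changed", "Requirements_Document"),
}


def artifacts_touched_by(triples):
    """Yield (activity, artifact) pairs for generation and for change."""
    for s, p, o in triples:
        if p == "wasGeneratedBy":  # case 1: art wasGeneratedBy ac
            yield o, s
        elif p == "changed":       # case 2: ac changed art
            yield s, o


def infer_was_based_on(triples):
    """Infer (art, 'wasBasedOn', pro) when the activity that generated or
    changed art also adopted the procedure pro."""
    adopted = {(s, o) for s, p, o in triples if p == "adopted"}
    return {
        (art, "wasBasedOn", pro)
        for ac, art in artifacts_touched_by(triples)
        for adopting_ac, pro in adopted
        if ac == adopting_ac
    }


print(sorted(infer_was_based_on(statements)))
```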
Inference 4 states that if an activity ac adopted a document template dt (a specific type of procedure) and this same activity ac generated or changed a document d (a specific type of artifact), the relation wasAppliedTo can be inferred between the document template dt and the document d. To express this inference, two specific restrictions using SWRL rules were created in the ontology:
Figure 4 shows that even though there is no explicit and direct relation in the provenance data between Test_Cases_Template and Payment_Test_Cases, we can infer, using the rules presented by Inference 4, that Test_Cases_Template wasAppliedTo Payment_Test_Cases.
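What distinguishes this inference from the previous one is the type restriction on both ends of the relation. The plain-Python sketch below illustrates that restriction; it is not the ontology's SWRL implementation. The predicate spellings type, adopted, wasGeneratedBy, and changed are assumptions, Test_Case_Design is a hypothetical activity, and the adoption of Software_Cost_Reduction by that activity is invented purely to show that a Method is filtered out.

```python
# (subject, predicate, object) triples; see the assumptions stated above.
statements = {
    ("Test_Cases_Template", "type", "Document_Template"),
    ("Software_Cost_Reduction", "type", "Method"),
    ("Payment_Test_Cases", "type", "Document"),
    ("Test_Case_Design", "adopted", "Test_Cases_Template"),
    ("Test_Case_Design", "adopted", "Software_Cost_Reduction"),
    ("Payment_Test_Cases", "wasGeneratedBy", "Test_Case_Design"),
}


def instances_of(triples, cls):
    """Return every subject explicitly typed as cls."""
    return {s for s, p, o in triples if p == "type" and o == cls}


def infer_was_applied_to(triples):
    """Infer (dt, 'wasAppliedTo', d) when an activity adopted the document
    template dt and generated or changed the document d."""
    templates = instances_of(triples, "Document_Template")
    documents = instances_of(triples, "Document")
    touched = {}  # activity -> documents it generated or changed
    for s, p, o in triples:
        if p == "wasGeneratedBy" and s in documents:
            touched.setdefault(o, set()).add(s)
        elif p == "changed" and o in documents:
            touched.setdefault(s, set()).add(o)
    return {
        (dt, "wasAppliedTo", d)
        for ac, p, dt in triples
        if p == "adopted" and dt in templates
        for d in touched.get(ac, ())
    }


print(infer_was_applied_to(statements))
```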
Inference 5 states that if an activity ac used an artifact art1 and this same activity generated a new artifact art2, the relation wasDerivedFrom can be inferred between art2 and art1. To express this inference, a restriction using a SWRL rule was created in the ontology.
Applied to the SDP domain, this inference makes it possible to infer that an artifact was derived from another even though this relation is not explicit in the provenance data. As can be seen in Figure 4, we can infer that Payment_Test_Cases wasDerivedFrom Requirements_Document.
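The derivation inference above can be sketched in plain Python as a join on the shared activity. This is only an illustration, not the SWRL rule from the ontology. The predicate spellings used and wasGeneratedBy follow PROV-O and are assumed to match the provenance data, and Test_Case_Design is a hypothetical activity name.

```python
# (subject, predicate, object) triples. Assumptions: predicate spellings
# and the hypothetical activity Test_Case_Design.
statements = {
    ("Test_Case_Design", "used", "Requirements_Document"),
    ("Payment_Test_Cases", "wasGeneratedBy", "Test_Case_Design"),
}


def infer_was_derived_from(triples):
    """Infer (art2, 'wasDerivedFrom', art1) when one activity used art1
    and generated art2."""
    used = {(s, o) for s, p, o in triples if p == "used"}  # (ac, art1)
    generated = {
        (o, s) for s, p, o in triples if p == "wasGeneratedBy"
    }  # (ac, art2)
    return {
        (art2, "wasDerivedFrom", art1)
        for using_ac, art1 in used
        for generating_ac, art2 in generated
        if using_ac == generating_ac
    }


print(infer_was_derived_from(statements))
```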
Inference 6 states that if an activity ac2 used or changed an artifact art that was generated by an activity ac1, the relation wasInformedBy can be inferred between ac2 and ac1, stating a dependency between these activities. To express this inference, two specific restrictions using SWRL rules were created in the ontology:
Figure 4 shows that even though there is no explicit and direct relation in the provenance data between the activities Codification and New_Resource_Specification, we can infer, using the rules presented by Inference 6, that Codification wasInformedBy New_Resource_Specification.
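The dependency between activities can be sketched in plain Python as shown next. This is an illustration and not the ontology's SWRL rules. The predicate spellings used, changed, and wasGeneratedBy are assumptions, and so is the statement that UML_class_model was generated by New_Resource_Specification. The check that ac1 and ac2 differ is a guard added in this sketch and is not stated in the inference definition.

```python
# (subject, predicate, object) triples; see the assumptions stated above.
statements = {
    ("UML_class_model", "wasGeneratedBy", "New_Resource_Specification"),
    ("Codification", "used", "UML_class_model"),
}


def infer_was_informed_by(triples):
    """Infer (ac2, 'wasInformedBy', ac1) when ac2 used or changed an
    artifact that was generated by ac1."""
    inferred = set()
    for ac2, p, art in triples:
        if p not in ("used", "changed"):
            continue
        for generated_art, p_gen, ac1 in triples:
            if (
                p_gen == "wasGeneratedBy"
                and generated_art == art
                and ac1 != ac2  # sketch-only guard against self-dependency
            ):
                inferred.add((ac2, "wasInformedBy", ac1))
    return inferred


print(infer_was_informed_by(statements))
```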
Inference 7 is only useful when more than one process instance is being analyzed. It retrieves all the Stakeholders, Resources, Artifacts, and Procedures of a given SDP instance. To express this inference, the following restrictions using SWRL rules were created in the ontology:
Figure 4 shows that even though there is no explicit and direct relation in the provenance data between the SDP New_Resource_Development and all the Stakeholders, Resources, Artifacts, and Procedures, using this inference it is possible to obtain a direct association between them.
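One possible reading of this inference is sketched below in plain Python; it is not the set of SWRL rules from the ontology. The property names isPartOf (activity to SDP instance) and involved (SDP instance to element) are hypothetical, the other predicate spellings are assumed, and the membership of the two activities in New_Resource_Development is assumed for illustration. All names starting with Hypothetical_ are invented to show that elements of another process instance are left out.

```python
# (subject, predicate, object) triples; see the assumptions stated above.
statements = {
    ("New_Resource_Specification", "isPartOf", "New_Resource_Development"),
    ("Codification", "isPartOf", "New_Resource_Development"),
    ("New_Resource_Specification", "used", "Client_Request_Email"),
    ("New_Resource_Specification", "adopted", "Software_Cost_Reduction"),
    ("Codification", "used", "Financial_Module"),
    ("Hypothetical_Review", "isPartOf", "Hypothetical_Other_SDP"),
    ("Hypothetical_Review", "used", "Hypothetical_Checklist"),
}

# Predicates whose subject is an activity and whose object is an element.
ELEMENT_PREDICATES = {"wasAssociatedWith", "used", "changed", "adopted"}


def elements_of(triples, sdp):
    """Link an SDP instance directly to every stakeholder, resource,
    artifact, and procedure reachable through its activities."""
    activities = {s for s, p, o in triples if p == "isPartOf" and o == sdp}
    elements = set()
    for s, p, o in triples:
        if s in activities and p in ELEMENT_PREDICATES:
            elements.add(o)
        elif p == "wasGeneratedBy" and o in activities:
            elements.add(s)
    return {(sdp, "involved", element) for element in elements}


print(sorted(elements_of(statements, "New_Resource_Development")))
```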