Archive for the ‘Metadata’ Category

‘Calendar map’ released

Wednesday, July 30th, 2014

The recently released ‘calendar map’, a joint effort of Stanford University data stewardship and IR&DS, illustrates relationships between annual calendars and other time periods at Stanford, including the fiscal year (9/1-8/31), academic year (9/23-8/22, for AY2013-14), and GFS aid year (10/1-9/30).

This and other data stewardship ‘maps’ are available at
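
As a rough illustration of the fixed-boundary periods described above, the sketch below finds the 12-month period that contains a given date. It uses only the fiscal year (9/1-8/31) and GFS aid year (10/1-9/30) boundaries quoted in this post. The academic year is left out because its dates change from year to year. The function name `period_containing` is a hypothetical helper, not part of any Stanford system.

```python
from datetime import date, timedelta
from typing import Tuple

# Start (month, day) of each period, taken from the dates quoted in the post.
FISCAL_YEAR_START = (9, 1)    # fiscal year runs 9/1-8/31
GFS_AID_YEAR_START = (10, 1)  # GFS aid year runs 10/1-9/30


def period_containing(d: date, start_month: int, start_day: int = 1) -> Tuple[date, date]:
    """Return (first_day, last_day) of the 12-month period that contains d.

    Hypothetical helper for illustration only.
    """
    start = date(d.year, start_month, start_day)
    if d < start:
        # d falls before this calendar year's start date, so the period
        # containing it began in the previous calendar year.
        start = date(d.year - 1, start_month, start_day)
    next_start = date(start.year + 1, start_month, start_day)
    return start, next_start - timedelta(days=1)


post_date = date(2014, 7, 30)
print(period_containing(post_date, *FISCAL_YEAR_START))
print(period_containing(post_date, *GFS_AID_YEAR_START))
```

The same calendar date can fall in differently bounded fiscal and aid-year periods, which is the kind of relationship the calendar map is meant to make visible.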



Data dictionary migrated to Collibra

Wednesday, July 30th, 2014

As of June 3, our SUDS Data Dictionary has migrated from Confluence to Stanford’s new Collibra-based Data Governance Center.


Initial users are enthusiastic about the powerful content management capabilities that this tool provides.

User resources include the ‘SUDS-ZOO Dictionary’ sandbox, where users can experiment freely with the tool, without editing live content, as well as documentation compiled in our Collibra Resources community.

SUDS-SPO releases “Award-Agreement Map”

Wednesday, July 30th, 2014

The SUDS-SPO group (Stanford University Data Stewardship – Sponsored Projects) has released a graphical overview of relationships among major entities in sponsored projects, including sponsored proposals, sponsored agreements, sponsored awards, and Oracle awards.


This and other data stewardship ‘maps’ are available at

Progress in metadata standardization (SUDS-SC Meeting 2/18/2014)

Saturday, February 22nd, 2014

In the steering committee meeting on February 18, we reviewed recent progress in standardizing the data dictionary. Contents are becoming significantly more structured, making the information increasingly useful and flexible. Recent improvements include:

  • Attributes are now all explicitly noted in the Is Attribute Of field; this information is much more reliable than the many ways attribute structures had been tracked before
  • Common core content in Relationships and Discrepancies has been extracted into distinct fields (Is Type Of; Different From)
  • All available details on approval statuses are explicitly noted; the Status field now allows for progress metrics (e.g., IPESIRIS).
  • Types of references to external content have been assessed; citation formats identified and standardized; goals articulated; documentation in development.
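
To make the structured fields above more concrete, here is a minimal sketch of how a dictionary entry might be represented once relationships and status are held in distinct fields. The field names mirror those mentioned in the list (Is Attribute Of, Is Type Of, Different From, Status). The class, the helper functions, the term names, and the status values are all hypothetical, and this is not how Confluence or Collibra stores the data.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DictionaryEntry:
    """Hypothetical record for one data dictionary term."""
    name: str
    status: str                            # "Status" field; values below are made up
    is_attribute_of: Optional[str] = None  # "Is Attribute Of"
    is_type_of: Optional[str] = None       # "Is Type Of"
    different_from: List[str] = field(default_factory=list)  # "Different From"


def status_counts(entries: List[DictionaryEntry]) -> Counter:
    """Count entries per status, a simple progress metric."""
    return Counter(e.status for e in entries)


def attributes_of(entries: List[DictionaryEntry], parent: str) -> List[str]:
    """Return names of entries recorded as attributes of `parent`."""
    return [e.name for e in entries if e.is_attribute_of == parent]


# Placeholder terms for illustration only.
entries = [
    DictionaryEntry("Term A", status="Approved"),
    DictionaryEntry("Term B", status="Draft", is_type_of="Term A"),
    DictionaryEntry("Term C", status="Draft", is_attribute_of="Term A",
                    different_from=["Term B"]),
]

print(status_counts(entries))
print(attributes_of(entries, "Term A"))
```

Because each relationship lives in its own field, questions such as "which terms are attributes of Term A" or "how many terms are still in draft" become simple filters rather than free-text searches.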

Collibra Adoption, Coming Soon!

Monday, January 6th, 2014

We are delighted to announce that Stanford will soon begin using Collibra to manage our business glossary, metadata, and other data governance activities. This will be a significant upgrade from our current metadata repository, supporting our active ongoing metadata development activities and allowing rapid expansion into new areas. Stay tuned for further updates.

Faculty Map Update

Wednesday, December 18th, 2013

An updated version of Stanford’s “faculty map” (Stanford University Faculty and Related Staff Groupings) was released on December 18:

Further information, and other versions of this document, are available at

The faculty map illustrates relationships and properties of Stanford’s Professoriate Faculty, Academic Staff, and Other Teaching Staff, synthesizing information from PeopleSoft, Stanford’s Faculty Handbook, and other sources. The purpose of the faculty map is to facilitate a consistent understanding of these populations in communication and reporting within Stanford as well as externally.

The faculty map is intended as a reference for administrative use only. This document was developed by IR&DS in collaboration with Faculty Affairs, Office of the Provost, University Human Resources, the School of Medicine Office of Academic Affairs, and other Stanford stakeholders.

SUDS-FIN Homework – Payroll and Labor Expense Management – 10/16/2012

Tuesday, October 16th, 2012

Please review the definitions that were started during Tuesday’s meeting and provide any feedback through the comments functionality.

We’ll continue to work through the terms in the business questions and work on creating unambiguous definitions.  You can view the current questions (and terms in question) here:

Feel free to begin to critique other definitions and provide feedback through the comments functionality.

SUDS-FIN Minutes – Payroll and Labor Expense Management – 10/16/2012

Tuesday, October 16th, 2012

Attendees: Bryan Brown (FMCS), Rana Glasgal (UHR), Matt Hoying (FMCS), Marissa Lavelle (FMCS), Lillian Lee (IRDS), Nancy Lonhart (Medicine), Jamie Lutton (FMCS), Elaine Moise (FSS), Lily Ng (FMCS), Shawna Powell-Blunt (Payroll), Tim Reuter (OSR), Kurt Staufenberg (PMO), Andy Zell (FMCS)

Thanks to all of those who used the comment functionality on the wiki since the last meeting to continue the discussion.  Unfortunately, it is unlikely we’ll be able to discuss all of the terms in the course of the meetings, so it is critical that we all find time to continue the discussion online between meetings.

Starting this week, Matt will begin sending out the definitions that have been discussed for final approval.  If no issues are raised within the following week, the definition status will be updated to “approved.”

These minutes can be found at and additional documentation on today’s discussion can be found at  If you would like to listen to a recording of today’s discussion at


Data Definition Best Practices

Thursday, September 20th, 2012

Stanford DG recently created a draft of data definition best practices for our data stewardship groups.  This is still in draft form so please let Matt know if you have any feedback.

Link to Data Definition Best Practices

Research Data Stewardship – Kickoff Meeting – June 6, 2012

Thursday, June 7th, 2012

Attendees:  Sonia Barragan (RMG), Matt Hoying (Data Governance), Colleen James (RMG), Angel Mayorga (RMG), Kathleen Thompson (RMG)


  1. List major terms that will be considered to be in the scope of the project (all)
  2. Create wiki pages for known terms (Matt)
  3. Schedule weekly meetings (Kathleen)
  4. Send PDF of 2001 definitions to team (Matt)
  5. Research composition of 2001 UMG Working Group on Data (Matt)
  6. Draft information flow diagram (Matt)

This meeting focused on developing the scope, effort duration and deliverables to be produced as part of this effort.  The discussion began with specific examples of the impact of inconsistent/unclear data definitions and other data quality issues.  As data definitions are developed and data quality risks and errors are identified, this group will make a significant effort to document the associated business impact or operational risk.

The definitions developed in the course of this activity will not be considered “Approved” or “Institutional Definitions” until they receive formal sign-off from all necessary business and technical stakeholders.  A formal approval process and executive data stewardship council in this domain will be developed shortly.

An additional key activity will be the communication of these definitions to a broader set of stakeholders.  This training will improve institution-wide understanding of this information, reduce operational risk, and increase trust in the underlying data while providing a valuable source of feedback for the definitions produced by this group.

The focus of the definitions will be on the major data entities associated with the lifecycle of a proposal and award.  Using the sample lifecycle produced by RMG two weeks ago, the team will work on listing the concepts and terms that will be defined as part of this effort.  In this phase we will not be focusing on defining all of the attributes of these entities or fully describing the details/derivations of the entity subtypes.  The aim will be to make this an eight to ten week effort with weekly meetings of an hour and a half.  Kathleen will schedule the next meeting, most likely for next Thursday, June 14th, in the early afternoon.

In addition to this list of terms, the team will work to develop a graphical timeline of the proposal and award process, pointing out significant state changes and key dates.

The final deliverable of this short project will be a high-level information flow and CRUD (Create, Read, Update and Delete) matrix that displays where key types of data reside and the activities at those locations.  Matt will be sending out a draft diagram to the team shortly to use as a starting point.
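
For readers unfamiliar with the format, the sketch below shows one simple way a CRUD matrix could be represented: for each data entity, the operations (Create, Read, Update, Delete) that each system performs on it. The entities come from the proposal and award lifecycle discussed above. The system names and the matrix entries are hypothetical placeholders and do not describe actual Stanford systems.

```python
from typing import Dict, List, Set

# entity -> system -> set of CRUD operations performed there.
# System names and entries are hypothetical, for illustration only.
crud_matrix: Dict[str, Dict[str, Set[str]]] = {
    "Proposal": {"System A": {"C", "R", "U"}, "System B": {"R"}},
    "Award": {"System A": {"R"}, "System B": {"C", "R", "U", "D"}},
}


def systems_performing(matrix: Dict[str, Dict[str, Set[str]]],
                       entity: str, operation: str) -> List[str]:
    """Return the sorted systems that perform `operation` on `entity`."""
    return sorted(s for s, ops in matrix.get(entity, {}).items() if operation in ops)


def render(matrix: Dict[str, Dict[str, Set[str]]]) -> str:
    """Render the matrix as a plain-text table, one row per entity."""
    systems = sorted({s for row in matrix.values() for s in row})
    lines = ["Entity".ljust(10) + "".join(s.ljust(10) for s in systems)]
    for entity, row in matrix.items():
        cells = []
        for s in systems:
            ops = row.get(s, set())
            # Always list operations in C, R, U, D order.
            cells.append("".join(op for op in "CRUD" if op in ops).ljust(10))
        lines.append(entity.ljust(10) + "".join(cells))
    return "\n".join(lines)


print(render(crud_matrix))
print(systems_performing(crud_matrix, "Proposal", "R"))
```

Reading across a row shows every place an entity is touched, and reading down a column shows everything a given system does, which is what makes the matrix useful for spotting where data is created versus merely read.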

Finally, at the end of the meeting, the team developed a draft definition for the term “Proposal.”  This has been posted on the Data Governance Wiki and all team members are encouraged to further refine the definition on the wiki or make comments regarding the fitness of the definition.  Please also feel free to share the draft definition (on the wiki) with other subject matter experts and get their perspective.

Defined Terms:

Proposal: A proposal is a formal funding request on behalf of the University for external funding to support a scope of work defined as a Sponsored Project.

In the course of daily activities, if any team members come across data quality issues or opportunities, or develop out-of-scope definitions, please forward them to Matt for compilation.  Please reach out to Matt with any questions, corrections or additional information about this subject.