The public guidelines for curation are listed below. If you are a curator for SciCrunch or the Neuroscience Information Framework please contact an administrator or supervisor for access to the Curation Wiki on GitHub.
Curation Guide for SciCrunch Registry
SciCrunch Registry Resource (Overview)
The SciCrunch Registry, a core resource of SciCrunch, is a catalog of web resources that have been selected by SciCrunch curators, or contributed by the community, as valuable tools for researchers and students in the fields of neuroscience and biomedicine. The SciCrunch Registry contains a listing of a variety of resources including databases, software tools, brain atlases, granting agencies, and tissue banks. Resources are continually added and updated by SciCrunch's staff, affiliates, and users who recommend their resources to the Registry.
The SciCrunch Registry uses SciCrunch vocabularies to provide high level descriptions of the nature of the resource and its contents. However, unless the resource is a database or data set and has been registered with the SciCrunch data integration tools, the SciCrunch Registry does not search the contents of these databases directly. For example, searching for global key words such as "genes" or "tissue bank" will bring up the various resources that have those descriptors, whereas "GRM1" or "C57BL/6J-rcw3J/J" will not bring up results, as the specific gene name or strain names are not tagged for each resource. The SciCrunch Registry is a place where there is a list of Alzheimer's disease tissue banks, but it will not tell the user which types of tissues are found in each tissue bank. This type of "drill down" search is provided for a subset of databases through SciCrunch's Data Federation.
Community involvement is encouraged. Anyone may add a new resource, or edit existing resources if they have an account. All additions and edits are curated by SciCrunch staff to comply with SciCrunch standards and policies. Part of this process involves emailing the resource creator in order to ensure its viability. This also serves to establish a professional relationship with the content creator.
As a general rule, users should register a root/individual resource and not a subset of that root/individual resource. It shall be considered an individual resource if it is maintained by a single entity, and has the properties of one or more individual web pages that are related by a theme and HTML links. Most often the individual pages share portions of the URL, however, unrelated URLs may be incorporated into a single web resource as alternate URLs. In the event that a subgroup of pages represents a sufficient shift in theme, it should be classified as an independent resource. For example, the department of neuroscience of a university (resource 1) may have a lab led by a researcher (resource 2).
Registering a resource to the SciCrunch Registry is the simplest form of registration. The registration form asks for the name of the resource, URL, and some additional basic information, including resource type(s).
Once approved, the resource is immediately assigned a SciCrunch ID. It will also be included in the SciCrunch Registry (updated weekly every Friday), where it is available through direct query (through SciCrunch's search results) with links back to the original source. Please note that, as of 12/5/17, search results may show some discrepancies related to curation mode (a GitHub ticket has been filed for this discrepancy). Searching by the exact SciCrunch ID will give more accurate results.
Anyone, whether it is the resource owner or not, may register any non-commercial (exceptions apply) neuroscience-related resource. If you are the resource owner you may claim yourself as owner when creating it on SciCrunch, as well as add the “Registered with SciCrunch” icon to your site.
New resources added to the registry and curated should also be tweeted about. Curators should send an email or a note over the discussion board to the registry’s social media manager.
After a resource is registered as a SciCrunch Registry resource, resource owners may create a sitemap or register their resource as part of the Data Federation to provide direct access to dynamic content – see Sitemap and Data Federation below.
What resources are included in the SciCrunch Registry?
The SciCrunch Registry is not exclusive to any one type of resource. Rather, it contains myriad resources that are deemed valuable to the biomedical community. Most of these are freely available on the Web, although some are restricted to a small community of users due to commercial interests, or laws governing the sharing of sensitive data. We are relying on feedback from the community as to what types of resources they would like to see. For example, would the community like to see more commercial resources? Should we be including well known general resources such as GenBank and NCBI? Should we be listing journals and scientific organizations? The rule of thumb for SciCrunch staff is that if the resource is conceivably useful to some biomedical researchers, then it should be included.
Some classes of resources, such as electronic journals, are generally not included in the SciCrunch Registry. In general, SciCrunch is not yet ready to include certain kinds of resources, although exceptions have been made. Journals are already searched in the Literature section, so providing additional access seems redundant unless there is some specific reason for doing so, such as a database of supplemental materials published in the journal. Literature searches typically do not search these types of materials adequately, and so they are valuable additions to SciCrunch.
SciCrunch places a high priority on resources that are recommended by their owners, so these resources are typically included in the SciCrunch Registry relatively quickly. To register a resource, please visit the SciCrunch Resource page. All pending resources are reviewed by SciCrunch curators.
What makes a good resource?
A good resource is one that is determined to be of value to the greater biomedical community. SciCrunch is only interested in resources the user can interact with, i.e., resources that the public can take and do something with, such as a database, tool or service. However, we currently include departments and labs within universities. This inclusion challenges our definition of "user interaction," but these resources are still essential to the way we are building and curating our ontology.
Other traits of a good resource:
- Non-profit: Preferably, all resources are open source or available to the public with no additional monetary motive. However, there are many exceptions that have been made for commercial resources that are of high value to the biomedical community. If a resource is commercial, SciCrunch needs to view the resource very carefully since the goals of SciCrunch and the commercial resource may not be aligned.
- Access: Is the resource accessible by the community at large or only individuals who are at a particular institution? At this time, we will not be cataloging institution-specific resources, but only those that may be used by the community at large.
- Richness of resource: The resource needs to be functional, i.e., we will not be listing resources that are still developing and have no product available. These sites may be listed in the registry, and tagged for future curation.
- SciCrunch is not a patient targeted resource: SciCrunch should not concern itself with making sure that all clinical trials are registered, as these are in ClinicalTrials.gov, or adding blogs from a patient's (or relatives of patient's) perspective. SciCrunch should include early phase clinical trials as SciCrunch resources, if they have methods sections that are published (therefore providing information to others). The terms "experimental protocol" and "clinical" should be included as keywords and the term "Human" should be in the species field. These resources should not be tagged as a clinical knowledge base in the Keywords field if they do not provide data.
- Journals: These should not be added to the Registry as we have a literature section.
Adding a SciCrunch Registry Resource
To add a resource to the SciCrunch Registry, go to the "Resources" tab on SciCrunch.org and click on the button "Create a Resource".
Step 1: Type the name of the resource to check if an RRID already exists for it.
Step 2: Choose your submission type as "Resource" or "Organization" to proceed. "Resource" will provide different informational fields from "Organization", so be sure to pick the right submission type.
Users are required to fill out a simple form with facts about the resource. The required fields are a short description of the resource and the URL. Additional fields are requested, and we encourage the community to do the best they can to fill these out. Not all fields will be applicable, e.g., a software resource may not have an associated organism. The better the record is filled out, the more it benefits the user/owner.
Remember that the goal of these entries is to be used for search across resources. Information, as far as possible, must be machine readable and human readable. Therefore, do not just copy terms, but phrase fields so that they are machine readable and human readable. Two examples will illustrate:
- Input the grant information using the "Funding Information" field. Put the title of the organization (full name if it's a lesser known organization, but abbreviations are fine for NIH, NIDDK, etc.) and the grant number if provided. Organization titles without numbers can be submitted just once, but when including grant numbers, each one must be added separately.
For example, MRC may be added just once. But if instead you have two MRC grants, they must be added as MRC|0123456789 and MRC|9876543210.
- Keywords: longitudinal, fasciculus, lf, medulla, oblongata, mo, etc.
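The funding rule above can be sketched in a few lines of Python. This is only an illustration of the "Organization|GrantNumber" pattern shown in the MRC example; the function name `funding_entries` is hypothetical and is not part of any SciCrunch tool.

```python
def funding_entries(organization, grant_numbers=()):
    """Return one funding entry per grant number.

    Hypothetical helper: with no grant numbers the organization is
    listed once; otherwise each grant gets its own 'ORG|NUMBER' entry.
    """
    org = organization.strip()
    numbers = [n.strip() for n in grant_numbers if n.strip()]
    if not numbers:
        return [org]
    return [f"{org}|{number}" for number in numbers]


print(funding_entries("MRC"))
print(funding_entries("MRC", ["0123456789", "9876543210"]))
```

Each returned string would be entered as a separate value in the "Funding Information" field.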
Generally speaking, all new resources will be curated by a SciCrunch Curator within 7 days.
Other methods of resource nomination include emailing email@example.com, or notifying SciCrunch staff directly.
Curators should always be logged in when adding new resources/editing.
Naming a Resource
Generally speaking, try to name resources just as they are presented on the website with regards to capitalization and spacing e.g., "PubMed" would not be "Pubmed", or "Pub Med".
Additional exceptions include:
- If a resource begins with "The" drop it when naming the resource unless the abbreviation incorporates the T.
- For resources with common/general names e.g., "Alzheimer's Disease Center" include the associated University: "Boston University Alzheimer's Disease Center" in the title. E.g., Gene --> NCBI Gene. Do not name resources "Department of Pharmacology" as many universities have one. Likewise, do not have a generic abbreviation, e.g., DOP; rather, use the one provided along with the school name abbrev.
- Do not make up names, synonyms or abbreviations unless you feel these alternate names will be searched for to find the resource.
- For all university department sites here is the template: University School Department e.g., University of California San Diego School of Medicine Department of Pharmacology.
- Do not use colons, quotation marks, @, brackets or ampersands in a resource name as these create problems, including resources being unable to be tagged "curated". Synonyms can be used to circumvent this problem. (@ and brackets () cannot be used at all in any of the form's fields.)
- Avoid special characters like "/" and ":" as no one will query for them.
- Do not use version numbers as these will change over time.
- Avoid extra description in the title. Remember, these will be used for search and for alphabetizing. The name of the resource should be the common name:
- Yes: Flybrain - No: FlyBrain - An Online Atlas and Database of the Drosophila Nervous System.
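Several of the naming rules above are mechanical enough to pre-check before submission. The sketch below is a rough heuristic, not an official validator: the function name, the character set, and the version-number pattern are assumptions drawn from the bullets above, and a flagged name still needs a curator's judgment (for example, a leading "The" is allowed when the abbreviation incorporates the T).

```python
import re

# Characters the guidelines say to avoid in a resource name (assumed set).
FORBIDDEN = set(':"@()[]&/') | {"\u201c", "\u201d"}
# Heuristic for version numbers such as "1.52", "v2.0" or "version 3".
VERSION_PATTERN = re.compile(r"\bv?\d+(\.\d+)+\b|\bversion\s+\d+", re.IGNORECASE)


def check_resource_name(name):
    """Return a list of possible problems with a proposed resource name."""
    problems = []
    if name.strip().lower().startswith("the "):
        problems.append('starts with "The"')
    bad = sorted({ch for ch in name if ch in FORBIDDEN})
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    if VERSION_PATTERN.search(name):
        problems.append("looks like it contains a version number")
    if " - " in name:
        problems.append("may contain extra description after a dash")
    return problems


for candidate in ["Flybrain",
                  "FlyBrain - An Online Atlas and Database of the Drosophila Nervous System"]:
    print(candidate, "->", check_resource_name(candidate))
```

An empty list means none of these heuristics fired; it does not guarantee the name matches how the resource presents itself on its website.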
Requested Resource Form Fields (Basic tab)
The description should be one to two sentences. It can often be paraphrased or copied from the "about us" section of the resource or from its home page.
First sentence: Noun with basic core function. EX: Source code that performs multiple alignment of peptidic sequences.
Note for first sentence: The first word(s) should generally be a statement of the classification used by the SciCrunch curator for resource type. If the resource is classified as a data set, the first line should read "Data set that...". If it is a software application, it should read "Software application..." These may be changed slightly for grammatical or readability issues, but it is good practice that the human readable definition and the machine readable definition should be the same.
Second sentence: Concise expansion on core function. EX: It utilizes posterior decoding and a sequence-annealing alignment, instead of the traditional progressive alignment method.
While it is tempting to copy the description of the resource verbatim from the web site, please do not do this indiscriminately. Rather, turn them into informative, pithy, machine-readable resource descriptions. These descriptions will be displayed as snippets by many tools that access the resource registry. Thus, the first line of the description should be as informative as possible.
Let's say we wanted to add a resource that is called: Cow brain gene expression atlas
Good leading sentence: Atlas detailing the three dimensional expression of 20,000 genes across major regions of the cow brain.
Bad leading sentence: The Cow Brain Gene Expression Atlas was developed by the University of X and aims to provide an increased understanding of ...
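The difference between the two leading sentences can be approximated with a simple check. This is a hedged sketch: the function name and both rules are assumptions inferred from the guidance above (lead with the resource type, do not open with the resource's own name), and the guidelines explicitly allow small wording changes for readability, so a warning is a prompt to review rather than an error.

```python
def check_leading_sentence(description, resource_type, resource_name):
    """Return possible problems with the opening of a description."""
    problems = []
    lowered = description.strip().lower()
    # The first word(s) should generally state the resource type.
    if not lowered.startswith(resource_type.lower()):
        problems.append("does not open with the resource type")
    # Opening with the resource's own name wastes the snippet's first words.
    name = resource_name.lower()
    if lowered.startswith(name) or lowered.startswith("the " + name):
        problems.append("opens with the resource name")
    return problems


name = "Cow brain gene expression atlas"
good = "Atlas detailing the three dimensional expression of 20,000 genes across major regions of the cow brain."
bad = "The Cow Brain Gene Expression Atlas was developed by the University of X and aims to provide ..."
print(check_leading_sentence(good, "Atlas", name))
print(check_leading_sentence(bad, "Atlas", name))
```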
The following are guidelines and best practices for reviewing and writing resource descriptions:
Include variations of the name that are used in the website or associated paper. Save abbreviations for the Abbreviation field.
Include the resource's funding information (supporting agency and funding support). Separate multiple grants by a comma. Format: Supporting Agency + space + funding support number, e.g., Office of the Director NIH 000000000, Contract HHSN27120080035C.
Look for supporting agency(s) on the website. This will often be available at the bottom of the page or in an acknowledgements section. When this information is not found on the website, it can often be obtained from a paper(s) about the resource. Papers that describe the resource can often be obtained by searching PubMed for the name of the resource. Verify the paper is describing the resource not just mentioning it. The information can be found in the "Acknowledgements" or "Funding" section.
Look for the grant(s) funding the resource on the website. This information will often be available at the bottom of the page or in an acknowledgements section. When this information is not found on the website, it can often be obtained from a paper(s) about the resource. Papers that describe the resource can often be obtained by searching PubMed for the name of the resource. Verify the paper is describing the resource not just mentioning it. The information can be found in the "Acknowledgements" or "Funding" section. Contracts should be listed in this field too; just add the word “Contract” beforehand.
Information, as far as possible, must be machine readable and human readable. Therefore, do not just copy terms, but curate them so that they are machine readable and human readable. "Contract #s N01-HD02-3343, N01-MH9-0002, N01-NS-9- 2314, -2315, -2316, -2317, -2319, -2320" was entered into the "supported by" field. Note, that if I were looking for N01-NS-9-2316, I would get zero results. A human knows what that list means but to get the computer to know requires additional programming. Curate by supplying the complete grant number.
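To make the point concrete, here is a minimal Python sketch that expands the shorthand list above into complete grant numbers. It assumes, as a human reader would, that an item beginning with "-" replaces the last hyphen-separated segment of the preceding full number; the function name is hypothetical.

```python
import re


def expand_grant_numbers(raw):
    """Expand shorthand such as 'N01-NS-9- 2314, -2315' into full numbers."""
    expanded = []
    prefix = None  # everything before the last hyphen of the latest full number
    for item in raw.split(","):
        token = re.sub(r"\s+", "", item)  # drop stray spaces inside a number
        if not token:
            continue
        if token.startswith("-"):
            if prefix is None:
                raise ValueError(f"suffix {token!r} has no preceding full number")
            expanded.append(prefix + token)
        else:
            expanded.append(token)
            prefix = token.rsplit("-", 1)[0]
    return expanded


raw = "N01-HD02-3343, N01-MH9-0002, N01-NS-9- 2314, -2315, -2316, -2317, -2319, -2320"
numbers = expand_grant_numbers(raw)
print(", ".join(numbers))
print("N01-NS-9-2316" in numbers)
```

After expansion, a search for N01-NS-9-2316 matches an entry exactly, which the abbreviated list could not provide.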
This field is for the user to submit any comments about the submitted resource.
For Biospecimen resources, please add the available Sample types to this field in the following format: Sample type: Blood, DNA, Urine, Cell, etc. Please add this information as keywords too.
This field is for the curator to make any additional comments about this resource.
Additional Resource Types
Additional resource types have been created to provide a standardized method of classifying resources. Resources should be labeled with the main thing(s) that the resource offers using terms within the Resource Type Hierarchy.
What is the primary product offered at this web address? Software tool? A data set? A service? That is, what would a user expect to take away from this site? Avoid assigning resource types for very minor functions. For example, if a site offering a database on nucleolar proteins has a discussion tab where they advertise a position for hire, do not characterize this resource as a job resource. A user should always understand why he or she was taken to a site, i.e., they shouldn't have to dig for information-it should be obvious. You may use the keywords to add additional resource descriptors if you think they are highly relevant.
The goal of resource type categorization is separate from keywords or other properties in that the resource type should inform the user as to the central purpose of the resource and not the particulars. A resource type is the “product” that is offered. For example, MGI is a database of mouse genes and is labeled as a "Database;" the Mutant Mouse Regional Resource Center accepts and distributes mutant mice and is labeled as an "organism repository;” and the Michael J. Fox Foundation for Parkinson's Research funds grants so it is labeled a "Funding resource."
To do this categorization, we created an internal consensus resource descriptor list based on the interactions with the library community and the BRO resource ontology, as well as several NIF partners.
The NIF Resource module was created within the NIFSTD ontology as a separate module. This module fixes the set of high level categories, adding classes like "Service resource", and also attempts to harmonize with the Biomedical Resource Ontology (BRO), NITRC resource types and OBI classes.
All of the individual resource types currently fall into at least 1 of the 8 major categories below, and the user may search by these categories.
A complete up-to-date listing of the Resource Type Hierarchy and their definitions is available through NeuroLex. An alternate view can be found through Bioportal's NIF Resource Type Hierarchy View.
These resource descriptors are meant to narrow search results by the type of thing that a neuroscientist is looking for. We believe that they are useful as general categories because they are in common English and tend to be understood by neuroscientists quickly. The question that is to be answered is "what is the end user looking for?" For example, if the user is looking for a transgenic mouse, they should not be bombarded with software tools that hit the same keywords or data sources that talk about the mouse.
The resource should be tagged with all applicable resource types, but not resource types that pertain to sub-resources if they will become separate SciCrunch resources. In general, if you have to assign too many labels, you are probably better off creating separate pages for some of the tools, rather than trying to characterize everything a particular resource has to offer in total. In general, the trend in SciCrunch has been to use less granular resource types to simplify choices by the user. Thus, we now favor just “database” over “web-accessible database”. Any additional characteristics can be covered by the keywords.
Assigning resource type or types can be challenging, as many websites offer multiple products and an individual product can serve multiple roles. All resources are curated by the SciCrunch curator, so do not be concerned if you have difficulty.
Many times, portal sites have a lot of valuable resources that aren't apparent from the home page. In this case, SciCrunch has to decide whether to create a separate entry for the resource or tag the general resource with a bunch of tags. One of our guiding principles is that the user should know why they are taken to a site. For example, an organization claims that it has a training program for Ph.D. students, but it takes the SciCrunch curator significant time to find out where on the site this information is listed. That page may not be particularly useful without going through the home page of the organization. In this case, SciCrunch would tag the organization home page with "Graduate program", but would include in the description that such a program is offered and how users can find out about it within the resource. In contrast, a model organism database may have an ontology that is available through their home page, but is difficult to find. As the ontology page can be considered a self-contained resource, that is, you don't need to read the home page to understand it, SciCrunch would list the ontology as a separate resource.
E.g., just because a resource has images, it does not mean you should tag it with image. You would only tag it with image if that was one of the main things the resource offers. Image can always be added as a keyword for these types of scenarios.
E.g., Model organism databases such as Xenbase should only be tagged as a Database and Repository. You can add other resource types such as Organism-related portal, Data analysis service and Organism supplier as keywords.
E.g., Databases that offer data analysis services such as BLAST should only be marked as a database - not data analysis service. This can be added as a keyword.
Multiple entries are to be separated by a comma. Anytime you get more than 3 resource types begin thinking about breaking the resource up into more resources.
Keywords and resource types will be treated in a special way by the SciCrunch search systems allowing them to be ranked higher than other search results.
Is Listed by
A resource is listed by another resource. Must be in the SciCrunch database.
Ex.) CiteULike is a digital tool listed by Connected Researchers.
A resource lists another resource. Must be in the SciCrunch database.
Ex.) Connected Researchers lists CiteULike as a digital tool.
Is Used by
A resource is used by another resource. Data or information may be pulled from this resource. Must be in the SciCrunch database.
Ex.) KEGG and its pathways are used by METLIN.
A resource uses another resource to fulfill its purpose. Typically in reference to tools, but can also be a category for information used by another resource. Must be in the SciCrunch database.
Ex.) Oncogenomic Database of Hepatocellular Carcinoma uses information from Ensembl.
Is Recommended by
A resource is recommended by another resource. Must be in the SciCrunch database.
A resource recommends another resource. Must be in the SciCrunch database.
State the availability of the resource/licensing information. E.g., if the resource is a biobank, can anyone request biomaterials? Is it public, open source, freely available but must cite, freely available to academic institutions, etc.? If available, this information can be obtained either from the website or the related article.
When the information is available, the field should cover access of the resource: can you add to the resource, can you take from the resource / what are the terms, and is the resource still available? If the resource is no longer available, please add the following to this field as well as to the top of the description: THIS RESOURCE IS NO LONGER IN SERVICE, documented on ‘full month’ ‘day’, full year. (e.g., THIS RESOURCE IS NO LONGER IN SERVICE, documented on September 05, 2013.). When adding this to the Availability field, remove all punctuation so that it is recognized as one statement by the computer. If the user can add to the resource, add "The community can contribute to this resource" in the field.
Separate each standardized entity with a comma (delimiter for the wiki). For example, an Availability field for a given resource may look like this: Open unspecified license, Acknowledgement requested, THIS RESOURCE IS NO LONGER IN SERVICE documented on September 05 2013, The community can contribute to this resource.
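The formatting rules for this field can be sketched as follows. The two helper names are hypothetical, and the month names are spelled out explicitly so the result does not depend on the system locale; the sketch simply reproduces the example above.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def out_of_service_notice(documented_on):
    """Build the no-longer-in-service statement without punctuation."""
    month = MONTHS[documented_on.month - 1]
    return ("THIS RESOURCE IS NO LONGER IN SERVICE documented on "
            f"{month} {documented_on.day:02d} {documented_on.year}")


def availability_field(entries):
    """Join entries with commas, removing commas inside each entry."""
    cleaned = [entry.strip().replace(",", "") for entry in entries if entry.strip()]
    return ", ".join(cleaned)


print(availability_field([
    "Open unspecified license",
    "Acknowledgement requested",
    out_of_service_notice(date(2013, 9, 5)),
    "The community can contribute to this resource",
]))
```

Removing commas inside an entry matters because the comma is the delimiter: a notice written as "September 05, 2013" would otherwise be split into two values.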
For a full list see Availability values in the NeuroLex.
The URL(s) where the resource posts under which conditions you may use the resource. This can include other titles, e.g., copyright page, citation policy, policy, terms and conditions, etc., (comma separated for multiple entries). This field should not include the url of the actual license. (The actual license has its own entry with associated url (s)).
Other/Alternate URLs that retrieve the source (comma separated for multiple entries).
URLs that used to retrieve the source but no longer work (comma separated for multiple entries).
This field does not need to be altered.
Owner-specified ID. This will be the RRID if it exists. Otherwise, the “SCR_”-prefixed ID is the RRID. Only owners and curators can add canonical IDs.
This field was created primarily for the biospecimen resources to state if the bioresources were to be used for research, transplantation, therapy, education etc.
Multiple entries are to be separated by a comma.
If the data resource concerns a disease, set of diseases, or condition, make sure that they are stated (e.g., Parkinson’s disease, neurodegenerative disorder, Batten’s disease, Aging, Normal control, etc.)
Additional entries are to be separated by a comma.
The physical location that the resource is located in, if known.
This field is mainly used for biobanks and what sort of processing the biospecimens have been put through, e.g., Frozen, paraffin, slide, cryopreserved, stained, fresh, etc.
Separate multiple entries by a comma.
Add organisms represented in the resource. E.g., if the resource is a database of mouse gene expression, add “mouse” to this field. Not all resources will have an associated organism; many software resources, for example, do not. Some resources are not forthcoming with this information. For instance, if the database is a clinical trial database, make sure that it is labeled “human”. Many resources mention the organism(s) in the description, but as some do not, it becomes very important to capture this information.
Multiple organisms may be added; just separate them with commas.
The organism's age(s) should be classified using the NIF annotation standards for age classification, and this should be added to the species and keywords fields, e.g., Late adult human, Embryonic Mouse as it includes more information than just the organism (the age).
This is the primary resource type. Entries will either be a resource, commercial, an institution or a university.
Occasionally there is a PDF or other publication that is about the resource. You may also link directly to the paper, even if it is in PubMed. This field will only accept one URL. Please include the full URL that includes the http:// (or the like) part for the link to work.
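A quick way to check a value before entering it is to parse it with Python's standard `urllib.parse` module. This is a sketch only: the function name is hypothetical, restricting the scheme to http, https, or ftp is an assumption about what "or the like" covers, and the URLs shown use the reserved example.org domain.

```python
from urllib.parse import urlparse

ACCEPTED_SCHEMES = ("http", "https", "ftp")  # assumed reading of "or the like"


def is_single_full_url(value):
    """True if value is exactly one URL that includes its scheme and host."""
    if len(value.split()) != 1:  # empty, or more than one entry
        return False
    parts = urlparse(value.strip())
    return parts.scheme in ACCEPTED_SCHEMES and bool(parts.netloc)


print(is_single_full_url("https://example.org/paper.pdf"))
print(is_single_full_url("www.example.org/paper.pdf"))
```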
Add the social url for this resource, for example FaceBook, Google+ and WordPress.
This field is dedicated to the terms under which the content is made available. Place a short description about the licenses used.
Only the twitter handle should be used for this field. For example, @Twitter::neuinfo. The twitter handle “neuinfo” should be used.
Scope of Curation for SciCrunch Registry Resources
The goal of curation is to establish a set of identifiers that will help the end user find relevant resources, but not overwhelm the user.
Here is a concrete example: a user was looking in the SciCrunch Registry for any resources that were annotated with the term "Locus Ceruleus" and the CNS Forum was returned. There is no mention of "Locus Ceruleus" on any page within the CNS Forum, but one of its subcomponents is called brain explorer. This feature contains a set of images of brain regions that were pulled by curators and annotated. Thus, the main site CNS Forum was returned for the "Locus Ceruleus" query. This added annotation is not helpful but confusing in this case, because to find any mention of the "Locus Ceruleus" the user would need to navigate down four link levels from the main page to a list of brain structures. Most users would not do this and would simply believe that the result was an error. Therefore, the annotation should be narrow enough that it captures the main features of the site, but not information that is too deep within the site to easily find. For resources with deep structured content, consider exposing them through the SciCrunch data federation.
Another case that is difficult to assess is the case of protein or gene databases. It is often possible to obtain a so called "data dump" of the individual records from a database and one possible curation method is to take the data dump, strip the tags and place a cleaned list of terms in the registry file. Thus, a database registered in this way would always return if the end user queried for any of the proteins or genes within the data dump. Several problems arise with this strategy, including updating of information and also preferential treatment in search of databases that can dump data.
To address the first issue of updating the information, a single data dump will create a snapshot of the data as it was when the data dump occurred. This may be a good idea for relatively static web entities, such as an atlas from an individual experiment. Data will not be added to the atlas, but it is a good reference resource. However, most scientific databases are not static entities; for example, GENSAT is updated daily at 6AM EST. Therefore, to stay current with new developments, any data dump would need to be repeated as often as new data becomes available. Accomplishing this task manually on a daily basis is not a reasonable expectation of a human curator; it is more amenable to an automated program. So any web resource that has a significant and changing component should be annotated generally and added as a possible level 2/3 resource candidate.
The second problem that arises with a data dump model is the preferential finding of databases that allow their contents to be easily dumped. The contents of the PubChem or UniProt databases may be too vast for a human to easily parse them, so these sites tend to be left out of the data dump class, but they are more likely to contain any protein data than easily parsed databases like KARG (with only thousands of entries). Again this creates a problem in searching for data, because while PubChem is certain to have relevant data to the query, a smaller database will come up preferentially because its data has been dumped and parsed.
Thus, the scope of annotation should be relatively superficial for level 1 resources, and also should be consistent in scope.
The Antibody Registry Curation Standards
The Antibody Registry is a database of antibody records that have been contributed by the community, or gathered and verified by SciCrunch curators, as valuable, unique antibodies for researchers and students in the field of biomedicine. The Antibody Registry contains a listing of a vast number of unique antibody records. Records are continually added and updated by SciCrunch’s staff, affiliates, and users who have accounts with the Registry.
Community involvement is encouraged. Anyone may add a new antibody record if they have an account. All additions are curated by SciCrunch staff to verify their originality and authenticity, as well as to add any missing information.
Once an antibody entry is added to the Antibody Registry, the record is immediately assigned a Research Resource Identifier (RRID). It will also be included in the Antibody Registry if it is found to be a unique antibody (updated every 8 days), where it is available through direct query (through the Antibody Registry’s search results).
What makes a good entry?
A good entry is a unique, machine-searchable antibody record. The Antibody Registry is only interested in entries with these qualities because its goal is to make biomedical research methods more transparent and rigorous.
Adding an Antibody Registry entry
To add an entry to the Antibody Registry, click on the “Add” tab on AntibodyRegistry.org. Choose your submission type as “Commercial Antibody” or “Personal Antibody” to proceed. "Commercial Antibody" will provide slightly different required informational fields from "Personal Antibody", so be sure to pick the right submission type. Most importantly, please remember to first check to see if the antibody is already registered in the system. This can be done by searching for the antibody using its catalog number or clone ID on the Antibody Registry website.
Users are required to fill out a simple form with facts about the antibody. The required fields are the "Catalog Number" and the "Provider Website" for a commercial antibody, and the "Antibody Name", "Raised in Species", "Provider Name/Institution", "AB Target", "Internal Label", "Clonality", "Provider Website", and "Defining Citation" for a personal antibody. Additional fields are requested, and we encourage the community to fill these out as completely as they can. Not all fields will be applicable; e.g., a polyclonal antibody will not have a Clone ID. The better the record is filled out, the more it benefits the users/vendors. Antibody submitters are encouraged to contact the registry’s curator to keep the resource's content up to date; e.g., if the antibody is now listed in a newly published article, send an email requesting that the citation be added to the defining citation field.
Remember that the goal of these entries is to be used for search across the database. Information, as far as possible, must be both machine readable and human readable. Therefore, do not just copy terms, but phrase fields so that they are machine readable; e.g., avoid starting any section with a space.
For example, suppose "ALX-805-504" was entered into the "Catalog Number" field. A search for ALX-805-504-5001 would then return zero results. A human knows what that catalog number means, but getting the computer to know requires additional programming. Curate by supplying the complete catalog number as seen on the vendor’s website.
Due to increased levels of spam, the ability to create new antibodies has been restricted to logged-in users. You will therefore need to create an account (top right of antibodyregistry.org) or log in to the Antibody Registry before adding a new resource.
Users and curators are required to always be logged in when adding new resources/editing.
Remember that all entries will be curated by SciCrunch staff, so it is not imperative that you get everything correct.
Requested Resource Form Fields
The catalog number should be added exactly as it appears on the vendor website for commercial antibodies.
Do not attempt to truncate the catalog number as this could result in duplicate records or an incorrectly assigned RRID. Make sure to include any dashes or spaces, e.g., -, if they are included in the catalog number as seen on the vendor website.
The catalog number should have no special characters recorded, e.g., #, $, /. This is done to ensure that the catalog number is machine readable.
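The catalog number rules above can be sketched as a small check. This is a minimal illustration in Python, not the Registry's actual validation: the function names are hypothetical, and the set of disallowed characters contains only the three examples this guide names (#, $, /). The exact-match helper assumes that search compares the stored string as-is, which is why a truncated catalog number fails to match.

```python
# Only the characters this guide lists as examples; the real set is an assumption.
DISALLOWED = set("#$/")


def catalog_number_problems(value):
    """Return a list of human-readable problems found in a catalog number."""
    problems = []
    # Fields should not start (or end) with a space.
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    # Dashes and internal spaces are allowed; the listed special characters are not.
    found = sorted(ch for ch in set(value) if ch in DISALLOWED)
    if found:
        problems.append("special characters: " + " ".join(found))
    return problems


def is_exact_match(query, stored):
    """Assumed search behavior: the query must equal the stored catalog number."""
    return query == stored


print(catalog_number_problems("ALX-805-504"))             # no problems found
print(catalog_number_problems(" #9638"))                  # two problems reported
print(is_exact_match("ALX-805-504-5001", "ALX-805-504"))  # truncated entry does not match
```

A check like this only flags problems for a person to review; whether a catalog number is complete still has to be confirmed against the vendor's website.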
Add the URL of the entry including the http:// portion; a URL such as www.neuinfo.org will not link. You must provide the full URL, e.g., http://www.neuinfo.org/.
The URL should only include the portion required to function, e.g., use http://neuromorpho.org/neuroMorpho/ rather than http://neuromorpho.org/neuroMorpho/index.jsp, and use https://www.niddkrepository.org/ rather than https://www.niddkrepository.org/home/. That said, please verify that the trivial portion of the URL is indeed not needed; surprisingly, some URLs will not work without it.
The URL should link directly to the antibody’s product page, e.g., use https://www.cellsignal.com/products/primary-antibodies/14-3-3-t-antibody/9638?N=102236+4294956287&fromPage=plp rather than https://www.cellsignal.com/, to avoid any confusion about the antibody’s identity.
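Two of the URL rules above lend themselves to a quick automated check: whether the URL includes its scheme, and whether it points at a site's home page rather than a product page. The sketch below uses Python's standard urllib.parse module; the function names are hypothetical, and it does not test whether a URL actually resolves, which still needs to be verified by hand.

```python
from urllib.parse import urlsplit


def has_scheme_and_host(url):
    """True if the URL has an http(s) scheme and a host, so it can be linked."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_bare_homepage(url):
    """True if the URL has no path or query, i.e. it is a site root."""
    parts = urlsplit(url.strip())
    return parts.path in ("", "/") and not parts.query


print(has_scheme_and_host("www.neuinfo.org"))              # False: scheme missing
print(has_scheme_and_host("http://www.neuinfo.org/"))      # True
print(is_bare_homepage("https://www.cellsignal.com/"))     # True: not a product page
```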
Add the name of the vendor or researcher name/institution that provided the individual antibody. If the name of the provider appears in the suggested list below the “Provider Name/Institution” search bar, then please select it to avoid creating a duplicate vendor.
Use the English version of any organization name if it is readily available. For institutions, state the country where the organization is located, as well as the relevant laboratories therein.
Avoid the use of commas when entering the name of the vendor or researcher/institution if at all possible.
If a vendor or researcher/institution is known by multiple names, the curator will add these synonyms to the entry after it has been submitted by the user. These names should be separated by commas.
Add the name given by the vendor on the site. If there is no name given or the antibody is a “personal antibody”, then create a name by combining the antibody target, the clone ID in parentheses, the species the antibody was raised in, and its clonality; e.g., if an antibody targeted the Rab5 protein, had a clone ID of C8B1, was raised in a rabbit, and was monoclonal, then the antibody's name would be “Rab5 (C8B1) Rabbit mAb”.
If some of this information is unavailable or unknown, then simply fill in the known information when possible.
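The naming convention above can be expressed as a short helper. This is a sketch under stated assumptions rather than a Registry tool: the function name is hypothetical, "mAb" comes from the example in this guide, and "pAb" for polyclonal is an assumed abbreviation that the guide does not specify. Missing parts are simply skipped, matching the instruction to fill in only what is known.

```python
def build_antibody_name(target=None, clone_id=None, host=None, clonality=None):
    """Combine the known parts into a name like "Rab5 (C8B1) Rabbit mAb"."""
    # "mAb" follows the guide's example; "pAb" is an assumption.
    abbreviations = {"monoclonal": "mAb", "polyclonal": "pAb"}
    parts = []
    if target:
        parts.append(target)
    if clone_id:
        parts.append(f"({clone_id})")
    if host:
        parts.append(host.capitalize())
    if clonality:
        # Fall back to the clonality as written if no abbreviation is known.
        parts.append(abbreviations.get(clonality.lower(), clonality))
    return " ".join(parts)


print(build_antibody_name("Rab5", "C8B1", "rabbit", "monoclonal"))
# Rab5 (C8B1) Rabbit mAb
```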
Raised in Species
Add the organism represented as the host species of the antibody, e.g., “rabbit”.
Add the organisms represented as the target species of the antibody, e.g., if the antibody targets proteins in mice, add "Mouse" to this field. Some antibody listings do not state this information. In that case, infer it where possible; for instance, if an antibody is used in a clinical trial, make sure that it is labeled "Human."
This information is labeled as “Species Reactivity” on many vendors’ websites.
Multiple organisms may be added; just separate them with commas.
The organism's age(s) should be classified using the NIF annotation standards for age classification and added to the species and keywords fields, e.g., "Late adult human" or "Embryonic mouse", since this conveys more information (the age) than the organism alone.
Add the name of the target of the antibody. This is usually a protein, e.g., “myosin”.
This section is only requested for “Personal Antibodies”. Add an identifier unique to the user's lab; e.g., “Labname_001” or “mylab_1023”.
Add the clonality of the antibody, e.g., “monoclonal”, “polyclonal”, “oligoclonal”, or “unknown”.
PubMed IDs from papers about the antibody should be added to this field.
The PubMed ID can be obtained from the website, when available, or by searching PubMed for the antibody.
Multiple IDs should be separated by commas, but only the first entry will be linked to PubMed.
If the resource does not have a PMID number, use the DOI. Separate each paper entry by a comma. Format example: “25018728, 23195120” or “DOI 10.1111/iwj.12345”
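The citation format above (comma-separated PubMed IDs, or a DOI prefixed with "DOI") can be parsed with a few lines of Python. This is a minimal sketch, not the Registry's parser: the function name and the "UNKNOWN" label are hypothetical, and it assumes that a PMID is written as digits only.

```python
import re


def parse_citations(field):
    """Split a citation field into (kind, identifier) pairs."""
    entries = []
    for raw in field.split(","):
        item = raw.strip()
        if not item:
            continue  # ignore empty pieces, e.g. from a trailing comma
        if re.fullmatch(r"\d+", item):
            entries.append(("PMID", item))
        elif item.upper().startswith("DOI"):
            # Drop the "DOI" prefix and any separating space or colon.
            entries.append(("DOI", item[3:].strip(" :")))
        else:
            entries.append(("UNKNOWN", item))  # flag for curator review
    return entries


print(parse_citations("25018728, 23195120"))
# [('PMID', '25018728'), ('PMID', '23195120')]
print(parse_citations("DOI 10.1111/iwj.12345"))
# [('DOI', '10.1111/iwj.12345')]
```

Since only the first entry is linked to PubMed, the first pair returned is the one worth checking most carefully.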
Add the target clone ID of the antibody in the “Clone ID” section. This can usually be obtained from the vendor’s website or can sometimes be found in the antibody’s name in parentheses.
The clone ID should be entered exactly how it is found, e.g., “F1.652”.
Add the isotype of the antibody; e.g., if the antibody is an IgG isotype, then add "IgG" to the product isotype section.
Make sure to use the proper capitalization for the isotype and to be as specific as possible when documenting the isotype.
Add the conjugate of the antibody to the product conjugate section, e.g., “unconjugated antibody”.
Add the form of the antibody to the product form section, e.g., “liquid”.
UniProt IDs for the antibody should be added to this field.
The UniProt ID can be obtained from the website, when available, or can sometimes be obtained from the vendor’s website.
Add the amino acids that make up the specific antibody binding site in this field. This is usually a set of 5 or more amino acids, e.g., “DYKDDDDKC”.
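A binding-site entry such as “DYKDDDDKC” can be sanity-checked before submission. The sketch below assumes the sequence is written in the one-letter code for the 20 standard amino acids, as in the example above; the function name is hypothetical. Because the guide says a site is "usually" 5 or more residues, a shorter sequence is only flagged, not rejected.

```python
# One-letter codes for the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_epitope(sequence):
    """Report letters outside the standard set and unusually short sequences."""
    seq = sequence.strip().upper()
    return {
        "sequence": seq,
        "invalid_letters": sorted(set(seq) - STANDARD_AMINO_ACIDS),
        "shorter_than_usual": len(seq) < 5,
    }


print(check_epitope("DYKDDDDKC"))
# {'sequence': 'DYKDDDDKC', 'invalid_letters': [], 'shorter_than_usual': False}
```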
This field is for the user to submit any comments about the submitted entry including the recommended application of this antibody.
The curator should submit any comments about changes made to the antibody information, reasons for rejection, consolidation with other antibodies registered with the Antibody Registry, and validation for images on the material data sheet, which should be uploaded to the Antibody Registry either by the user or the curator.
Scope of Curation for Antibody Registry Resources
The goal of curation is to establish a unique set of antibody identifiers that will help the end user find relevant records, but not overwhelm the user.
Data Ingestion Workflow and Quality Control for Data Federation:
Data Ingestion Workflow
The process to add a new resource to Data Federation requires the following sequence of steps.
1. Registering a database
The new resource must first be added to the SciCrunch Registry.
2. Import data
Data from the new resource is added through DISCO.
3. Curation or view building
The new data is curated into a view table using the concept mapper tool.
After the view is built, it is deployed onto the beta website for review. An email should be sent out informing other curators, and if applicable, the data owner, of the new view and providing the specific link to the beta dataset to review. The curator should wait about 3 to 4 days for other people to provide feedback and make necessary changes.
Each newly curated view is approved for production by a curator. The process of deploying a new view to production (which is done by IT personnel) occurs on Saturday evenings, so the new data should be available on Monday mornings.
6. Make a default snippet
Each new view needs a snippet. Curators can edit the default snippets under My Account -> Edit SciCrunch -> Sources. After making the snippet, make sure to test it.
7. Posting to Social Media
SciCrunch Data Ingestion Quality Control
Prior to releasing a new data view to production, it is important to scrutinize the view and all its components as a user would. A Data Ingestion Checklist https://docs.google.com/spreadsheets/d/1oV4R9-Tz5XgG9ED_t2FPh7Dz-W9tzPYYJo_V7qWiiRM/edit#gid=0 has been assembled to assist with this and now includes a Quality Control section. Once all the checks are complete, firstname.lastname@example.org should be notified. The curator should wait about 3 to 4 days for other people to provide feedback and make necessary changes.