Recommended Terminologies and Example Values
Beyond single beacons or managed beacon networks, the use of common, widely supported
and standardized query values is especially important for enabling federated queries.
The main examples relevant to variation queries here are referenceName and variantType.
Value Definitions in the Beacon Schemas
In most of its schemas, the Beacon v2 specification is not prescriptive about the specific values permitted for individual parameters and provides a number of legacy values as examples in its inline documentation. This was intended to promote flexibility (e.g. non-human use cases) and simple adoption (e.g. reusing VCF terms).
Beacon Scouts To Do
We will add and document recommended terminologies and use case examples during the 2024/2025 Beacon Variation Scouts process.
VRSification and value definitions
Because VRS concepts are already being adopted, recommendations for some values can be provided now.
referenceName (v1/v2) and referenceSequence (v2plus/VQS)
Reference names (referenceName) in Beacon v1/v2 allow versioned and unversioned
formats and use an assemblyId to specify the reference genome.
Recommendation
- use of un-prefixed chromosome names (e.g. 1, 17, X) for human genomes together with an assemblyId (e.g. GRCh38), OR
- use of prefixed RefSeq chromosome names (e.g. refseq:NC_000001.11)
In current practice, option 1 appears to be more widespread and is recommended over option 2, although implementations should be able to disambiguate and remap between the two.
?referenceName=17&assemblyId=GRCh38
Moving forward, the Beacon v2+ specification will adopt the VRS referenceSequence,
which uses the Refget definitions, and will include the use of prefixed RefSeq
chromosome IDs as an option.
?referenceSequence=refseq:NC_000017.11
variantType (v1/v2) and copyChange (VQS)
In Beacon v1/v2 the variantType parameter is a free-text field that can be used
to scope queries to specific types of genomic variation. It is very permissive
and ambiguous, with frequent use of VCF terms but also terms from other sources
such as SO and EFO.
With the current move of the Beacon v2+ specification towards the use of VRS
concepts, the future of a general variantType equivalent for variation queries
has yet to be determined. However, VRS v2 provides a dedicated copyChange parameter
as well as a dedicated vocabulary derived from EFO terminology.
Recommendation
- for CNV queries use the EFO terms for relative copy number changes for either the copyChange or the variantType parameter, depending on the Beacon version
- in Beacon v2+ the copyChange parameter MUST use a primaryCode representing one of:
    - EFO:0030069: complete genomic loss
    - EFO:0020073: high-level loss
    - EFO:0030068: low-level loss
    - EFO:0030067: loss
    - EFO:0030064: regional base ploidy
    - EFO:0030070: gain
    - EFO:0030071: low-level gain
    - EFO:0030072: high-level gain
- for other types of variation use variantType according to the query documentation and emerging examples
    - variantType might be replaced during future development
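As an illustration of the CNV recommendation, the following sketch checks a value against the EFO codes listed above and picks the parameter name by Beacon version. The function names and the version labels "v2" and "v2+" are assumptions made for this example and are not defined by the specification.

```python
# EFO terms for relative copy number changes, as listed in the recommendation.
COPY_CHANGE_TERMS = {
    "EFO:0030069": "complete genomic loss",
    "EFO:0020073": "high-level loss",
    "EFO:0030068": "low-level loss",
    "EFO:0030067": "loss",
    "EFO:0030064": "regional base ploidy",
    "EFO:0030070": "gain",
    "EFO:0030071": "low-level gain",
    "EFO:0030072": "high-level gain",
}


def validate_copy_change(primary_code: str) -> str:
    """Return the code unchanged if it is one of the recommended EFO terms."""
    code = primary_code.strip()
    if code not in COPY_CHANGE_TERMS:
        raise ValueError(f"not a recommended copy change term: {primary_code!r}")
    return code


def cnv_query_param(primary_code: str, beacon_version: str = "v2") -> str:
    """Build the CNV part of a query string.

    Assumption: "v2+" selects copyChange, any other label selects variantType.
    """
    code = validate_copy_change(primary_code)
    name = "copyChange" if beacon_version == "v2+" else "variantType"
    return f"{name}={code}"


print(cnv_query_param("EFO:0030070", "v2+"))
print(cnv_query_param("EFO:0030067"))
```

Rejecting unknown codes early keeps free-text values out of federated queries, where other beacons may not recognize them.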