gtf_extract: extract data items from GTF/GFF
Overview
The gtf_extract utility extracts selected data items from a
GTF file and outputs them in tab-delimited format.
Note
The program can also operate on GFF files provided the --gff
option is specified.
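One likely reason the format has to be stated explicitly is that GTF and GFF3 encode the attributes column differently: GTF uses `key "value";` pairs, while GFF3 uses `key=value` pairs separated by semicolons. The sketch below illustrates that difference only. The function names and sample attribute values are hypothetical, and how gtf_extract itself handles the two formats (or which GFF version it expects) is not stated here.

```python
import re


def parse_gtf_attributes(text):
    """Parse GTF-style attributes of the form: key "value"; key "value";"""
    return dict(re.findall(r'(\S+)\s+"([^"]*)"', text))


def parse_gff3_attributes(text):
    """Parse GFF3-style attributes of the form: key=value;key=value"""
    attrs = {}
    for item in text.strip().split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        key, _, value = item.partition("=")
        attrs[key] = value
    return attrs


# Made-up attribute strings, for illustration only
gtf_attrs = parse_gtf_attributes('gene_id "G1"; gene_name "abc";')
gff_attrs = parse_gff3_attributes("ID=G1;Name=abc")
print(gtf_attrs)
print(gff_attrs)
```

Both calls return a plain dictionary, so any later attribute lookup can be the same regardless of the input format.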
Usage and options
General usage syntax:
gtf_extract OPTIONS <gtf_file>
Options:
--version
    show program’s version number and exit
-h, --help
    show the help message and exit
-f FEATURE_TYPE, --feature=FEATURE_TYPE
    only extract data for lines where feature is FEATURE_TYPE
--fields=FIELD_LIST
    comma-separated list of fields to output in tab-delimited format
    for each line in the GTF, e.g. chrom,start,end. Fields can either
    be a GTF field name (i.e. chrom, source, feature, start, end, score,
    strand and frame), or the name of an attribute (e.g. gene_name,
    gene_id etc). Data items are output in the order they appear in
    FIELD_LIST. If a field doesn’t exist for a line then '.' will be
    output as the value.
-o OUTFILE
    write output to OUTFILE (default is to write to stdout)
--gff
    specify that the input file is GFF rather than GTF format
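To show how the options combine, here is a minimal sketch that assembles a gtf_extract command line from Python. It follows the `gtf_extract OPTIONS <gtf_file>` syntax given above and uses only the documented options. The helper name `build_gtf_extract_command` and the file names are hypothetical, and the command is printed rather than executed.

```python
import shlex


def build_gtf_extract_command(gtf_file, feature=None, fields=None,
                              outfile=None, gff=False):
    """Return an argument list for gtf_extract (hypothetical helper)."""
    cmd = ["gtf_extract"]
    if feature:
        cmd.append(f"--feature={feature}")
    if fields:
        # --fields takes a comma-separated list; order is preserved
        cmd.append("--fields=" + ",".join(fields))
    if outfile:
        cmd.extend(["-o", outfile])
    if gff:
        cmd.append("--gff")
    cmd.append(gtf_file)  # the input file comes after the options
    return cmd


# File names below are placeholders
cmd = build_gtf_extract_command(
    "annotation.gtf",
    feature="gene",
    fields=["chrom", "start", "end", "gene_name"],
    outfile="genes.tsv",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where gtf_extract is installed.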
Output
The program outputs a tab-delimited line of data for each matching line
found in the input GTF file; the data items in the line are those
specified by the --fields option (or else all data items, if no fields
were specified).
For example, for --fields=chrom,start,end,strand, the GTF line:
chr1 HAVANA gene 11869 14412 . + . gene_id "ENSG00000223972.4" ...
will produce the output:
chr1 11869 14412 +
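The field selection described above can be sketched in a few lines of Python. This is an illustration of the documented behaviour, not the program's actual implementation: the function name `extract_fields` is hypothetical, the input is assumed to be tab-separated with attributes in the ninth column, and the sample line keeps only the `gene_id` attribute from the example.

```python
import re

# Names of the eight fixed GTF columns, as listed for --fields
GTF_COLUMNS = ["chrom", "source", "feature", "start",
               "end", "score", "strand", "frame"]


def extract_fields(line, fields, feature=None):
    """Return the requested fields as a tab-delimited string, or None
    if the line is skipped (comment, malformed, or wrong feature)."""
    if line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9:
        return None
    record = dict(zip(GTF_COLUMNS, cols[:8]))
    if feature is not None and record["feature"] != feature:
        return None
    # GTF attributes look like: key "value"; key "value";
    attributes = dict(re.findall(r'(\S+)\s+"([^"]*)"', cols[8]))
    values = []
    for name in fields:
        if name in record:
            values.append(record[name])
        else:
            # Missing attributes are reported as '.'
            values.append(attributes.get(name, "."))
    return "\t".join(values)


line = 'chr1\tHAVANA\tgene\t11869\t14412\t.\t+\t.\tgene_id "ENSG00000223972.4";'
print(extract_fields(line, ["chrom", "start", "end", "strand"]))
print(extract_fields(line, ["gene_id", "gene_name"]))
```

The first call prints the four requested columns separated by tabs. The second prints the gene ID followed by `.`, because the sample line has no `gene_name` attribute.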
By default the output of the program is written to stdout; use the
-o option to direct the output to a named file instead.