gtf_extract
: extract data items from GTF/GFF¶
Overview¶
The gtf_extract
utility extracts selected data items from a
GTF file and output in tab-delimited format.
Note
The program can also operate on GFF files provided the --gff
option is specified.
Usage and options¶
General usage syntax:
gtf_extract OPTIONS <gft_file>
Options:
-
--version
¶
show program’s version number and exit
-
-h
,
--help
¶
show the help message and exit
-
-f
FEATURE_TYPE
,
--feature
=FEATURE_TYPE
¶ only extract data for lines where feature is FEATURE_TYPE
-
--fields
=FIELD_LIST
¶ comma-separated list of fields to output in tab-delimited format for each line in the GTF, e.g.
chrom,start,end
.Fields can either be a GTF field name (i.e.
chrom
,source
,feature
,start
,end
,score
,strand
andframe
), or the name of an attribute (e.g.gene_name
,gene_id
etc).Data items are output in the order they appear in
FIELD_LIST
. If a field doesn’t exist for a line then'.'
will be output as the value.
-
-o
OUTFILE
¶ write output to OUTFILE (default is to write to stdout)
-
--gff
¶
specify that the input file is GFF rather than GTF format
Output¶
The program outputs a tab-delimited line of data for each matching line
found in the input GTF file; the data items in the line are those
specified by the --fields
option (or else all data items, if no fields
were specified).
For example, for --fields=chrom,start,end,strand
, the GTF line:
chr1 HAVANA gene 11869 14412 . + . gene_id "ENSG00000223972.4" ...
will produce the output:
chr1 11869 14412 +
By default the output of the program is written to stdout; use the
-o
option to direct the output to a named file instead.