I like using wc to keep track of how much I’ve written. However, wc is not the
best choice for finding out how many words you’ve written in a LaTeX document,
because it counts markup along with the prose – it’s better to use something
like texcount.
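A quick way to see the difference is to run wc -w on a fragment of LaTeX. The sketch below uses a made-up three-line fragment (the names sample and raw_words exist only for this example). wc -w counts every whitespace-separated chunk, markup included.

```shell
#!/bin/bash
# A made-up LaTeX fragment, used only for illustration.
sample='\section{Introduction}
\label{sec:intro}
Hello \emph{world}.'

# wc -w counts whitespace-separated tokens, so commands and labels count too.
# tr strips the padding that some wc implementations put around the number.
raw_words=$(printf '%s\n' "$sample" | wc -w | tr -d '[:space:]')

echo "wc -w reports: $raw_words"
```

This reports 4: the \section and \label commands each count as one word, even though the label never shows up in the typeset document.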
There are a lot of ways to get a word count for a LaTeX document, but I wanted to highlight a simple CLI approach I’ve been using:
#!/bin/bash
texcount "$1" -total | awk '{print $NF}' | tail -n 8 | sed ':a;/[0-9]$/{N;s/\n/+/;ba}' | sed 's/$/0/g' | qalc | head -n 3 | tail -n 1 | awk '{print $NF}'
Let’s look at what this does, shall we?
texcount is the main character here. Its typical output is a bit long-winded for my tastes:
File: latexworkshop.tex
Encoding: utf8
Words in text: 210
Words in headers: 3
Words outside text (captions, etc.): 0
Number of headers: 1
Number of floats/tables/figures: 0
Number of math inlines: 0
Number of math displayed: 0
Subcounts:
text+headers+captions (#headers/#floats/#inlines/#displayed)
1+0+0 (0/0/0/0) _top_
209+3+0 (1/0/0/0) Section: LaTeX workshop proposal}\label{latex-workshop-proposal}
- This is a script, so "$1" gets the first command-line argument. It is quoted so that file names with spaces survive; written as {$1}, the braces would be passed to texcount literally. #!/bin/bash is the hashbang, and tells the system which program it should use to execute the script.
- I want my output to only have the number of words, so let’s use -total with texcount.
- awk '{print $NF}' gets the last word on each line – which is a number.
- tail -n 8 gets the last 8 lines – skipping the filename.
- sed ':a;/[0-9]$/{N;s/\n/+/;ba}' puts all the numbers on the same line, with a + between each pair of numbers.
- sed 's/$/0/g' appends a 0 to the end (to take care of the trailing +).
- qalc is my preferred CLI calculator. We feed the equation obtained in the previous step into qalc and get the sum.
- head -n 3 and tail -n 1 isolate the line with the sum. Come to think of it, I think you could grep for the = sign.
- Finally, awk '{print $NF}' gets the last word – which is the total wordcount.
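The steps above can be tried without texcount or qalc installed. The sketch below is only an approximation: sample_counts is a made-up stand-in that prints the counts from the sample output earlier, followed by a blank line (an assumption about the shape of the texcount -total output, which is what would produce the trailing +). The sed program is the same one as in the script, split into -e pieces so it does not depend on how a particular sed parses ba}. Shell arithmetic stands in for qalc.

```shell
#!/bin/bash
# Hypothetical stand-in for `texcount -total file.tex`: the counts from the
# sample output, plus a trailing blank line (assumed, not verified).
sample_counts() {
  cat <<'EOF'
File: latexworkshop.tex
Encoding: utf8
Words in text: 210
Words in headers: 3
Words outside text (captions, etc.): 0
Number of headers: 1
Number of floats/tables/figures: 0
Number of math inlines: 0
Number of math displayed: 0

EOF
}

# Last field of each line, last 8 lines, joined with "+", then a final 0
# appended so the expression does not end in a dangling "+".
expr_str=$(sample_counts \
  | awk '{print $NF}' \
  | tail -n 8 \
  | sed -e ':a' -e '/[0-9]$/{N;s/\n/+/;ba' -e '}' \
  | sed 's/$/0/')

# Shell arithmetic used here in place of qalc.
total=$(( expr_str ))

echo "expression: $expr_str"
echo "total: $total"
```

With these sample numbers the expression comes out as 210+3+0+1+0+0+0+0 and the total as 214. Note that the sum includes the header, float and math counts as well as the three word counts, exactly as the original pipeline does.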