
cloc counts blank lines, comment lines, and physical lines of source code in many programming languages. It is written entirely in Perl with no dependencies outside the standard distribution of Perl v5.6 and higher (code from some external modules is embedded within cloc) and so is quite portable. cloc is known to run on many flavors of Linux, AIX, Solaris, IRIX, z/OS, and Windows. (To run the Perl source version of cloc on Windows one needs ActiveState Perl 5.6.1 or higher, or Cygwin installed. Alternatively one can use the Windows binary of cloc generated with perl2exe to run on Windows computers that have neither Perl nor Cygwin.)
cloc contains code from David Wheeler's SLOCCount, Damian Conway and Abigail's Perl module Regexp::Common, and Sean M. Burke's Perl module Win32::Autoglob.

cloc has many features that make it easy to use, thorough, extensible, and portable:

If cloc does not suit your needs here are other freely available counters to consider:
Other references:
Although cloc does not need Perl modules outside those found in the
standard distribution, cloc does rely on a few external modules.
Code from two of these external modules--Regexp::Common
and Win32::Autoglob--is embedded within cloc. A third module,
Digest::MD5, is used only if it is available.
If cloc finds Regexp::Common installed locally it will use that
installation. If it doesn't, cloc will install the parts
of Regexp::Common it needs to a temporary directory that is created
at the start of a cloc run then removed when the run is complete.
The necessary code from Regexp::Common v2.120 is embedded within
the cloc source code (see subroutine Install_Regexp_Common() ).
Only three lines are needed from Win32::Autoglob and these are
included directly in cloc.
Additionally, cloc will use Digest::MD5 to validate uniqueness among input files if Digest::MD5 is installed locally. If Digest::MD5 is not found the file uniqueness check is skipped.
The Windows binary is built on a computer that has both Regexp::Common and Digest::MD5 installed locally.
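The uniqueness check can be pictured with a short sketch. This is not cloc's code (cloc is written in Perl and uses Digest::MD5); it is a minimal Python illustration that assumes "unique" means "no two files have identical contents". The helper name unique_by_content is hypothetical.

```python
import hashlib

def unique_by_content(named_contents):
    """Return the names of inputs whose contents have not been seen before.

    named_contents: iterable of (name, bytes) pairs. Hypothetical helper:
    the first file with a given MD5 digest is kept, later duplicates are dropped.
    """
    seen = set()
    unique = []
    for name, data in named_contents:
        digest = hashlib.md5(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(name)
    return unique
```

In this sketch two files with the same bytes collapse to one entry, which mirrors the difference between the "text files" and "unique files" totals in a cloc run.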

cloc is a command-line program that takes file and/or directory names as inputs. Here's an example of running cloc against the Perl v5.8.8 source distribution:
prompt> tar zxf perl-5.8.8.tar.gz
prompt> cloc perl-5.8.8/
3106 text files.
2975 unique files.
1132 files ignored.
http://cloc.sourceforge.net v 0.90 T=70.0 s (27.9 files/s, 9480.5 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
Perl 1564 73960 88294 217162 x 4.00 = 868648.00
C 115 14872 17107 120583 x 0.77 = 92848.91
C/C++ Header 132 8426 21237 45229 x 1.00 = 45229.00
Bourne Shell 111 2987 5346 32954 x 3.81 = 125554.74
Lisp 1 583 1772 6121 x 1.25 = 7651.25
Make 8 479 459 2113 x 2.50 = 5282.50
Teamcenter def 2 0 0 1345 x 1.00 = 1345.00
yacc 2 125 72 1047 x 1.51 = 1580.97
C++ 3 101 214 444 x 1.51 = 670.44
DOS Batch 11 85 50 322 x 0.63 = 202.86
HTML 1 19 2 98 x 1.90 = 186.20
Java 2 6 1 23 x 1.36 = 31.28
-------------------------------------------------------------------------------
SUM: 1952 101643 134554 427441 x 2.69 = 1149231.15
-------------------------------------------------------------------------------
To run cloc on Windows computers one must first open up a command (aka DOS) window and invoke cloc.exe from the command line there.
The untar command in the above example is actually unnecessary as cloc can be told to work directly with compressed files. See the Advanced Use section for details.
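If the plain-text report needs to be processed further, each result row can be picked apart with a regular expression. The Python below is a sketch based only on the row layout shown in the example report (language name, four integer counts, and optionally "x scale = equivalent"); the function name parse_row is hypothetical and not part of cloc.

```python
import re

# Row layout assumed from the example report:
#   <language> <files> <blank> <comment> <code> [x <scale> = <equiv>]
ROW = re.compile(
    r"^(?P<language>.+?)\s+(?P<files>\d+)\s+(?P<blank>\d+)\s+(?P<comment>\d+)"
    r"\s+(?P<code>\d+)(?:\s+x\s+(?P<scale>[\d.]+)\s+=\s+(?P<equiv>[\d.]+))?\s*$"
)

def parse_row(line):
    """Return a dict for a result row, or None for headers and separator lines."""
    match = ROW.match(line)
    if match is None:
        return None
    row = {"language": match.group("language")}
    for key in ("files", "blank", "comment", "code"):
        row[key] = int(match.group(key))
    for key in ("scale", "equiv"):
        value = match.group(key)
        row[key] = float(value) if value is not None else None
    return row
```

Language names may contain spaces (for example "C/C++ Header"), which is why the name is matched lazily and the numeric columns are anchored to the end of the line.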

prompt> cloc
Usage: cloc [options] <file(s)/dir(s)> | <report files>
Count physical lines of source code in the given files and/or
recursively below the given directories.
Options:
--by-file Report results for every source file encountered
in addition to reporting by language.
--categorized=<file> Save names of categorized files to <file>.
--counted=<file> Save names of processed source files to <file>.
--exclude-dir=<D1>[,D2,] Exclude the given comma separated directories
D1, D2, D3, et cetera, from being scanned. For
example --exclude-dir=.cvs,.svn will skip
all files that have /.cvs/ or /.svn/ as part of
their path.
--exclude-lang=<L1>[,L2,] Exclude the given comma separated languages
L1, L2, L3, et cetera, from being counted.
--extract-with=<cmd> Use <cmd> to extract binary archive files (e.g.:
.tar.gz, .zip, .Z). Use the literal '>FILE<' as
a stand-in for the actual file(s) to be
extracted. For example, to count lines of code
in the input files
gcc-4.2.tar.gz perl-5.8.8.tar.gz
on Unix use
--extract-with='gzip -dc >FILE< | tar xfv -'
and on Windows use:
--extract-with="\"c:\Program Files\WinZip\WinZip32.exe\" -e -o >FILE< ."
(if you have WinZip installed there).
--force-lang=<lang>[,<ext>]
Process all files that have a <ext> extension
with the counter for language <lang>. For
example, to count all .f files with the
Fortran 90 counter (which expects files to
end with .f90) instead of the default Fortran 77
counter, use
--force-lang="Fortran 90",f
If <ext> is omitted, every file will be counted
with the <lang> counter. This option can be
specified multiple times (but that is only
useful when <ext> is given each time).
See also --script-lang.
--found=<file> Save names of every file found to <file>.
--ignored=<file> Save names of ignored files and the reason they
were ignored to <file>.
--no3 Suppress third-generation language output.
This option can cause report summation to fail
if some reports were produced with this option
while others were produced without it.
--print-filter-stages Print to STDOUT processed source code before and
after each filter is applied.
--progress-rate=<n> Show progress update after every <n> files are
processed (default <n>=100).
--quiet Suppress all information messages except for
the final report.
--report-file=<file> Write the results to <file> instead of STDOUT.
--read-lang-def=<file> Load from <file> the language processing filters.
(see also --write-lang-def) then use these filters
instead of the built-in filters.
--script-lang=<lang>,<s> Process all files that invoke <s> as a #!
scripting language with the counter for language
<lang>. For example, files that begin with
#!/usr/local/bin/perl5.8.8
will be counted with the Perl counter by using
--script-lang=Perl,perl5.8.8
The language name is case insensitive but the
name of the script language executable, <s>,
must have the right case. This option can be
specified multiple times. See also --force-lang.
--sdir=<dir> Use <dir> as the scratch directory instead of
letting File::Temp choose the location. Files
written to this location are not removed at
the end of the run (as they are with File::Temp).
--show-ext[=<ext>] Print information about all known (or just the
given) file extensions and exit.
--show-lang[=<lang>] Print information about all known (or just the
given) languages and exit.
--strip-comments=<ext> For each file processed, write to the current
directory a version of the file which has blank
lines and comments removed. The name of each
stripped file is the original file name with
.<ext> appended to it.
--sum-reports Input arguments are report files previously
created with the --report-file option. Makes
a cumulative set of results containing the
sum of data from the individual report files.
--write-lang-def=<file> Writes to <file> the language processing filters
then exits. Useful as a first step to creating
custom language definitions (see --read-lang-def).
-v[=<n>] Verbose switch (optional numeric value).
--version Print the version of this program and exit.
--csv Write the results as comma separated values.
--xml Write the results in XML.
--yaml Write the results in YAML.

prompt> cloc --show-lang
ABAP (abap)
Ada (ada, adb, ads, pad)
ADSO/IDSM (adso)
AMPLE (ample, dofile, startup)
ASP (asa, asp)
ASP.Net (asax, ascx, asmx, aspx, config, master, sitemap, webinfo)
Assembler (asm, S, s)
awk (awk)
Bourne Again Shell (bash)
Bourne Shell (sh)
C (c, ec, pgc)
C Shell (csh, tcsh)
C# (cs)
C++ (C, cc, cpp, cxx, pcc)
C/C++ Header (H, h, hh, hpp)
CCS (ccs)
COBOL (cbl, CBL, cob, COB)
ColdFusion (cfm)
CSS (css)
DAL (da)
DOS Batch (bat, BAT)
DTD (dtd)
Expect (exp)
Focus (focexec)
Fortran 77 (F, f, f77, F77, pfo)
Fortran 90 (F90, f90)
Fortran 95 (F95, f95)
Haskell (hs, lhs)
HTML (htm, html)
IDL (idl)
inc (inc)
Java (java)
Javascript (js)
JCL (jcl)
JSP (jsp)
Korn Shell (ksh)
lex (l)
Lisp (cl, el, jl, lsp, sc, scm)
LiveLink OScript (oscript)
Lua (lua)
m4 (ac, m4)
make (am, gnumakefile, Gnumakefile, Makefile, makefile)
MATLAB (m)
ML (ml, mli)
Modula3 (i3, ig, m3, mg)
MSBuild scripts (csproj, wdproj)
MUMPS (mps, m)
NAnt scripts (build)
NASTRAN DMAP (dmap)
Objective C (m)
Oracle Forms (fmt)
Oracle Reports (rex)
Pascal (dpr, p, pas, pp)
Patran Command Language (pcl, ses)
Perl (perl, PL, pl, plh, plx, pm)
PHP (php, php3, php4, php5)
Python (py)
Rexx (rexx)
Ruby (rb)
sed (sed)
SKILL (il)
SKILL++ (ils)
Softbridge Basic (sbl, SBL)
SQL (psql, sql)
Tcl/Tk (itk, tcl, tk)
Teamcenter def (def)
Teamcenter met (met)
Teamcenter mth (mth)
vim script (vim)
Visual Basic (bas, cls, frm, vb, vba, vbs)
XML (xml)
XSD (xsd)
XSLT (xsl, xslt)
yacc (y)
YAML (yaml, yml)
MATLAB, MUMPS, and Objective C are the only recognized languages which
map to the same file extension, .m.
cloc has a subroutine which attempts to identify the right language
based on the file's contents.
The above list can be customized by reading language definitions
from a file with the --read-lang-def option.

cloc's method of operation resembles SLOCCount's: First, create a list of files to consider. Next, attempt to determine whether or not found files contain recognized computer language source code. Finally, for files identified as source files, invoke language-specific routines to count the number of source lines.
A more detailed description:
--show-lang
and --show-ext options). Files which match are
classified as containing source code for that language.
Each file without an extension is opened and its first
line read to see
if it is a Unix shell script (anything that begins with #!).
If it is a shell script, the file is classified by that scripting
language (if the language is recognized). If the file does not
have a recognized extension or is not a recognized
scripting language, the file is ignored.
// and
(2) remove text between /* and */)
Apply each filter to the code to remove comments.
Count the left over lines (= Lcode).
The options modify the algorithm slightly. The
--read-lang-def option for example allows the user to
read definitions of comment filters, known file extensions, and known
scripting languages from a file. The code for this option is processed
between Steps 2 and 3.
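The classification step described above (extension first, then the #! line for files without an extension) can be sketched as follows. This is an illustrative Python outline, not cloc's Perl code. The two lookup tables are tiny hypothetical subsets: the extensions are taken from the --show-lang listing, while the interpreter names are assumptions.

```python
import os

# Tiny illustrative subsets; cloc's real tables are much larger.
EXTENSION_TO_LANGUAGE = {"c": "C", "pl": "Perl", "py": "Python", "sh": "Bourne Shell"}
SCRIPT_TO_LANGUAGE = {"perl": "Perl", "python": "Python", "sh": "Bourne Shell"}  # assumed names

def classify(filename, first_line=""):
    """Return a language name, or None if the file would be ignored."""
    extension = os.path.splitext(filename)[1].lstrip(".")
    if extension:
        # Files with an extension are classified by extension only.
        return EXTENSION_TO_LANGUAGE.get(extension)
    if first_line.startswith("#!"):
        parts = first_line[2:].split()
        if parts:
            interpreter = os.path.basename(parts[0])
            return SCRIPT_TO_LANGUAGE.get(interpreter)
    return None
```

A line such as "#!/usr/bin/env perl" would resolve to "env" in this simplified version and be ignored; handling that case is left out to keep the sketch short.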


How can you tell if cloc correctly identifies comments? One way to convince yourself cloc is doing the right thing is to use its --strip-comments option to remove comments and blank lines from files, then compare the stripped-down files to originals.
Let's try this out with the SQLite amalgamation, a C file containing all code needed to build the SQLite library along with a header file:
prompt> tar zxf sqlite-amalgamation-3.5.6.tar.gz
prompt> cd sqlite-3.5.6/
prompt> cloc --strip-comments=nc sqlite3.c
1 text file.
1 unique file.
Wrote sqlite3.c.nc
0 files ignored.
http://cloc.sourceforge.net v 1.03 T=1.0 s (1.0 files/s, 82895.0 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
C 1 5167 26827 50901 x 0.77 = 39193.77
-------------------------------------------------------------------------------
The extension argument given to --strip-comments is arbitrary; here nc was used as an abbreviation for "no comments".
cloc removed over 31,000 lines from the file:
prompt> wc -l sqlite3.c sqlite3.c.nc
82895 sqlite3.c
50901 sqlite3.c.nc
133796 total
prompt> echo "82895 - 50901" | bc
31994
We can now compare the original file, sqlite3.c, and the one stripped of comments, sqlite3.c.nc, with tools like diff or vimdiff and see exactly what cloc considered comments and blank lines. A rigorous proof that the stripped-down file contains the same C code as the original is to compile these files and compare checksums of the resulting object files.
First, the original source file:
prompt> gcc -c sqlite3.c
prompt> md5sum sqlite3.o
cce5f1a2ea27c7e44b2e1047e2588b49 sqlite3.o
Next, the version without comments:
prompt> mv sqlite3.c.nc sqlite3.c
prompt> gcc -c sqlite3.c
prompt> md5sum sqlite3.o
cce5f1a2ea27c7e44b2e1047e2588b49 sqlite3.o
cloc removed over 31,000 lines of comments and blanks but did not modify the source code in any significant way since the resulting object file matches the original.
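The numbers in this walkthrough can also be cross-checked against each other: the blank and comment counts reported by cloc should add up to the number of lines removed, and all three counts should add up to the size of the original file. A small Python sketch of that arithmetic, using only figures quoted above (the function name is hypothetical):

```python
def lines_removed(blank, comment):
    """Lines that --strip-comments is expected to drop: blanks plus comments."""
    return blank + comment

# Figures from the cloc report and the wc output for sqlite3.c.
blank, comment, code = 5167, 26827, 50901
original_lines = 82895

removed = lines_removed(blank, comment)
assert removed == original_lines - code          # 31994 on both sides
assert blank + comment + code == original_lines  # the three counts cover the file
```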

cloc's --extract-with=<cmd>
option allows one to count lines of code within tar files, Zip files, or
other compressed archives for which one has an extraction tool.
cloc takes the user-provided extraction command and expands the archive
to a temporary directory (created with File::Temp),
counts the lines of code in the temporary directory,
then removes that directory. While not especially helpful when dealing
with a single compressed archive (after all, if you're going to type
the extraction command anyway why not just manually expand the archive?)
this option is handy for working with several archives at once.
For example, say you have the following source tarballs on a Unix machine
perl-5.8.5.tar.gz
Python-2.4.2.tar.gz
and you want to count all the code within them. The command would be
cloc --extract-with='gzip -dc >FILE< | tar xf -' perl-5.8.5.tar.gz Python-2.4.2.tar.gz
If that Unix machine has GNU tar (which can uncompress and extract in one step) the command can be shortened to
cloc --extract-with='tar zxf >FILE<' perl-5.8.5.tar.gz Python-2.4.2.tar.gz
On a Windows computer with WinZip installed in c:\Program Files\WinZip the command would look like
cloc.exe --extract-with="\"c:\Program Files\WinZip\WinZip32.exe\" -e -o >FILE< ." perl-5.8.5.tar.gz Python-2.4.2.tar.gz
Java .ear files are Zip files that contain additional Zip files. cloc can handle nested compressed archives without difficulty--provided all such files are compressed and archived in the same way. Examples of counting a Java .ear file in Unix and Windows:
Unix> cloc --extract-with="unzip -d . >FILE< " Project.ear
DOS> cloc.exe --extract-with="\"c:\Program Files\WinZip\WinZip32.exe\" -e -o >FILE< ." Project.ear
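The role of the >FILE< placeholder can be shown with a few lines of Python. This is only a sketch of the substitution idea, not cloc's implementation: the function name build_extract_command is hypothetical, and quoting the archive name with shlex.quote is an assumption that applies to Unix shells only.

```python
import shlex

PLACEHOLDER = ">FILE<"

def build_extract_command(template, archive):
    """Substitute one archive name for the literal >FILE< in an extraction template."""
    if PLACEHOLDER not in template:
        raise ValueError("extraction template must contain " + PLACEHOLDER)
    # Quoting guards against spaces in the archive name (assumption: POSIX shell).
    return template.replace(PLACEHOLDER, shlex.quote(archive))

# The same template is reused for every archive given on the command line.
for name in ("perl-5.8.5.tar.gz", "Python-2.4.2.tar.gz"):
    print(build_extract_command("gzip -dc >FILE< | tar xf -", name))
```

Each resulting command would then be run inside a fresh temporary directory, which is what makes the option convenient when several archives are counted at once.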

cloc can write its language comment definitions to a file or can read comment definitions from a file, overriding the built-in definitions. This can be useful when you want to use cloc to count lines of a language not yet included, to change association of file extensions to languages, or to modify the way existing languages are counted.
The easiest way to create a custom language definition file is to make cloc write its definitions to a file, then modify that file:
Unix> cloc --write-lang-def=my_definitions.txt
creates the file my_definitions.txt which can be modified then read back in with
Unix> cloc --read-lang-def=my_definitions.txt file1 file2 dir1 ...
Each language entry has four parts:
C++
filter remove_matches ^\s*//
filter call_regexp_common C
extension C
extension cc
extension cpp
extension cxx
extension pcc
3rd_gen_scale 1.51
C++ has two filters: first, remove lines that start with optional
whitespace and are followed by //.
Next, remove all C comments. C comments are difficult to express
as regular expressions so a call is made to Regexp::Common to get the
appropriate regular expression to match C comments which are then removed.
A more complete discussion of the different filter options may appear
here in the future. The output of cloc's
--write-lang-def option should provide enough examples
for motivated individuals to modify or extend cloc's language definitions.
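To make the two C++ filters concrete, here is a Python sketch that applies them in order to a piece of source text. It is an approximation written for illustration: the regular expression for C comments is a simplified stand-in for the one Regexp::Common provides, keeping the newlines of a removed comment is an assumption, and so is the way the blank and comment counts are derived from the line totals.

```python
import re

# Simplified stand-in for the C comment pattern that cloc obtains from Regexp::Common.
C_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)

def _keep_newlines(match):
    # Replace a comment by the newlines it spanned so later lines keep their positions.
    return "\n" * match.group(0).count("\n")

def count_cpp(source):
    """Count blank, comment, and code lines using the two C++ filters."""
    lines = source.splitlines()
    non_blank = [line for line in lines if line.strip()]
    # filter remove_matches ^\s*//
    kept = [line for line in non_blank if not re.match(r"^\s*//", line)]
    # filter call_regexp_common C
    text = C_COMMENT.sub(_keep_newlines, "\n".join(kept))
    code = [line for line in text.split("\n") if line.strip()]
    return {
        "blank": len(lines) - len(non_blank),
        "comment": len(non_blank) - len(code),
        "code": len(code),
    }
```

Note that a trailing // comment on a line of code does not match ^\s*//, so such a line still counts as code in this sketch.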

If you manage multiple software projects you might be interested in seeing line counts by project, not just by language. Say you manage three software projects called MySQL, PostgreSQL, and SQLite. The teams responsible for each of these projects run cloc on their source code and provide you with the output. For example, the MySQL team runs
cloc --report-file=mysql-5.0.24a.txt --extract-with='tar zxf >FILE<' mysql-5.0.24a.tar.gz
and provides you with the file mysql-5.0.24a.txt.
The contents of the three files you get are
Unix> cat mysql-5.0.24a.txt
http://cloc.sourceforge.net v 0.72 T=300.0 s (10.3 files/s, 5785.6 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
C++ 636 87248 108470 536619 x 1.51 = 810294.69
C 790 74025 85474 412662 x 0.77 = 317749.74
C/C++ Header 924 26969 53128 122434 x 1.00 = 122434.00
Bourne Shell 212 16081 18048 113940 x 3.81 = 434111.40
Tcl/Tk 235 5276 7497 30484 x 1.25 = 38105.00
Perl 31 1731 1512 7931 x 4.00 = 31724.00
Java 131 1374 1358 7686 x 1.36 = 10452.96
XML 25 540 22 3914 x 1.90 = 7436.60
SQL 8 173 56 2673 x 2.29 = 6121.17
HTML 13 244 22 2097 x 1.90 = 3984.30
awk 13 176 337 1967 x 3.81 = 7494.27
Assembler 14 169 0 1357 x 0.25 = 339.25
sed 1 0 0 772 x 4.00 = 3088.00
Teamcenter def 30 90 117 722 x 1.00 = 722.00
Make 10 40 19 203 x 2.50 = 507.50
DOS Batch 3 12 3 17 x 0.63 = 10.71
-------------------------------------------------------------------------------
SUM: 3076 214148 276063 1245478 x 1.44 = 1794575.59
-------------------------------------------------------------------------------
Unix> cat sqlite-3.3.7.txt
http://cloc.sourceforge.net v 0.72 T=49.0 s (3.0 files/s, 2733.8 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
C 65 4603 20237 49674 x 0.77 = 38248.98
Bourne Shell 8 3050 4218 24223 x 3.81 = 92289.63
Tcl/Tk 51 2609 911 18017 x 1.25 = 22521.25
C/C++ Header 10 234 1402 2194 x 1.00 = 2194.00
yacc 1 108 41 933 x 1.51 = 1408.83
HTML 2 128 0 873 x 1.90 = 1658.70
awk 6 6 82 180 x 3.81 = 685.80
Teamcenter def 1 0 0 101 x 1.00 = 101.00
Make 1 19 89 22 x 2.50 = 55.00
-------------------------------------------------------------------------------
SUM: 145 10757 26980 96217 x 1.65 = 159163.19
-------------------------------------------------------------------------------
Unix> cat postgresql-8.1.4.txt
http://cloc.sourceforge.net v 0.72 T=211.0 s (11.8 files/s, 5676.6 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
HTML 693 4703 19 412348 x 1.90 = 783461.20
C 743 75657 126646 407089 x 0.77 = 313458.53
C/C++ Header 476 7822 19293 32771 x 1.00 = 32771.00
Bourne Shell 48 2933 2897 27396 x 3.81 = 104378.76
SQL 185 5564 4216 17864 x 2.29 = 40908.56
lex 118 978 1346 15799 x 1.00 = 15799.00
yacc 6 1958 2399 14178 x 1.51 = 21408.78
Perl 30 1262 883 4356 x 4.00 = 17424.00
Make 172 1425 1349 3678 x 2.50 = 9195.00
Teamcenter def 4 1 0 525 x 1.00 = 525.00
XSL 2 49 30 137 x 1.90 = 260.30
Assembler 3 9 0 102 x 0.25 = 25.50
awk 1 3 30 20 x 3.81 = 76.20
Python 1 5 1 12 x 4.20 = 50.40
-------------------------------------------------------------------------------
SUM: 2482 102369 159109 936275 x 1.43 = 1339742.23
-------------------------------------------------------------------------------
While these three files are interesting, you also want to see
the combined counts from all projects.
That can be done with cloc's --sum-reports
option:
Unix> cloc --sum-reports --report-file=databases mysql-5.0.24a.txt sqlite-3.3.7.txt postgresql-8.1.4.txt
Wrote databases.lang
Wrote databases.file
The report combination produces two output files, one for sums by programming language (databases.lang) and one by project (databases.file).
Their contents are
Unix> cat databases.lang
http://cloc.sourceforge.net v 0.72
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
C 1598 154285 232357 869425 x 0.77 = 669457.25
C++ 636 87248 108470 536619 x 1.51 = 810294.69
HTML 708 5075 41 415318 x 1.90 = 789104.20
Bourne Shell 268 22064 25163 165559 x 3.81 = 630779.79
C/C++ Header 1410 35025 73823 157399 x 1.00 = 157399.00
Tcl/Tk 286 7885 8408 48501 x 1.25 = 60626.25
SQL 193 5737 4272 20537 x 2.29 = 47029.73
lex 118 978 1346 15799 x 1.00 = 15799.00
yacc 7 2066 2440 15111 x 1.51 = 22817.61
Perl 61 2993 2395 12287 x 4.00 = 49148.00
Java 131 1374 1358 7686 x 1.36 = 10452.96
XML 25 540 22 3914 x 1.90 = 7436.60
Make 183 1484 1457 3903 x 2.50 = 9757.50
awk 20 185 449 2167 x 3.81 = 8256.27
Assembler 17 178 0 1459 x 0.25 = 364.75
Teamcenter def 35 91 117 1348 x 1.00 = 1348.00
sed 1 0 0 772 x 4.00 = 3088.00
XSL 2 49 30 137 x 1.90 = 260.30
DOS Batch 3 12 3 17 x 0.63 = 10.71
Python 1 5 1 12 x 4.20 = 50.40
-------------------------------------------------------------------------------
SUM: 5703 327274 462152 2277970 x 1.45 = 3293481.01
-------------------------------------------------------------------------------
Unix> cat databases.file
http://cloc.sourceforge.net v 0.72
-------------------------------------------------------------------------------
Report File files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
mysql-5.0.24a.txt 3076 214148 276063 1245478 x 1.44 = 1794575.59
postgresql-8.1.4.txt 2482 102369 159109 936275 x 1.43 = 1339742.23
sqlite-3.3.7.txt 145 10757 26980 96217 x 1.65 = 159163.19
-------------------------------------------------------------------------------
SUM: 5703 327274 462152 2277970 x 1.45 = 3293481.01
-------------------------------------------------------------------------------
Report files themselves can be summed together. Say you also manage development of Perl and Python and you want to keep track of those line counts separately from your database projects. First create reports for Perl and Python separately:
cloc --report-file=perl-5.8.8.txt --extract-with='tar zxf >FILE<' perl-5.8.8.tar.gz
cloc --report-file=python-2.4.2.txt --extract-with='tar jxf >FILE<' Python-2.4.2.tar.bz2
then sum these together with
Unix> cloc --sum-reports --report-file=script_lang perl-5.8.8.txt python-2.4.2.txt
Wrote script_lang.lang
Wrote script_lang.file
Unix> cat script_lang.lang
http://cloc.sourceforge.net v 0.72
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
C 409 46920 35958 383652 x 0.77 = 295412.04
Python 1605 55998 31886 309549 x 4.20 = 1300105.80
Perl 1576 74568 89136 220919 x 4.00 = 883676.00
C/C++ Header 280 12169 26366 88089 x 1.00 = 88089.00
Bourne Shell 146 5201 7428 52115 x 3.81 = 198558.15
Lisp 4 1120 2291 9799 x 1.25 = 12248.75
Make 17 1092 939 5348 x 2.50 = 13370.00
Teamcenter def 10 144 88 3163 x 1.00 = 3163.00
HTML 22 516 2 2769 x 1.90 = 5261.10
yacc 2 125 72 1047 x 1.51 = 1580.97
XML 2 103 32 894 x 1.90 = 1698.60
Objective C 6 102 19 704 x 2.96 = 2083.84
C++ 4 104 215 451 x 1.51 = 681.01
DOS Batch 14 93 73 387 x 0.63 = 243.81
Expect 1 0 0 60 x 2.00 = 120.00
Java 2 6 1 23 x 1.36 = 31.28
sed 1 0 1 2 x 4.00 = 8.00
-------------------------------------------------------------------------------
SUM: 4101 198261 194507 1078971 x 2.60 = 2806331.35
-------------------------------------------------------------------------------
Unix> cat script_lang.file
http://cloc.sourceforge.net v 0.72
-------------------------------------------------------------------------------
Report File files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
python-2.4.2.txt 2149 96618 60365 651118 x 2.54 = 1656782.96
perl-5.8.8.txt 1952 101643 134142 427853 x 2.69 = 1149548.39
-------------------------------------------------------------------------------
SUM: 4101 198261 194507 1078971 x 2.60 = 2806331.35
-------------------------------------------------------------------------------
Finally, combine the combination files:
Unix> cloc --sum-reports --report-file=everything databases.lang script_lang.lang
Wrote everything.lang
Wrote everything.file
Unix> cat everything.lang
http://cloc.sourceforge.net v 0.72
-------------------------------------------------------------------------------
Language files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
C 2007 201205 268315 1253077 x 0.77 = 964869.29
C++ 640 87352 108685 537070 x 1.51 = 810975.70
HTML 730 5591 43 418087 x 1.90 = 794365.30
Python 1606 56003 31887 309561 x 4.20 = 1300156.20
C/C++ Header 1690 47194 100189 245488 x 1.00 = 245488.00
Perl 1637 77561 91531 233206 x 4.00 = 932824.00
Bourne Shell 414 27265 32591 217674 x 3.81 = 829337.94
Tcl/Tk 286 7885 8408 48501 x 1.25 = 60626.25
SQL 193 5737 4272 20537 x 2.29 = 47029.73
yacc 9 2191 2512 16158 x 1.51 = 24398.58
lex 118 978 1346 15799 x 1.00 = 15799.00
Lisp 4 1120 2291 9799 x 1.25 = 12248.75
Make 200 2576 2396 9251 x 2.50 = 23127.50
Java 133 1380 1359 7709 x 1.36 = 10484.24
XML 27 643 54 4808 x 1.90 = 9135.20
Teamcenter def 45 235 205 4511 x 1.00 = 4511.00
awk 20 185 449 2167 x 3.81 = 8256.27
Assembler 17 178 0 1459 x 0.25 = 364.75
sed 2 0 1 774 x 4.00 = 3096.00
Objective C 6 102 19 704 x 2.96 = 2083.84
DOS Batch 17 105 76 404 x 0.63 = 254.52
XSL 2 49 30 137 x 1.90 = 260.30
Expect 1 0 0 60 x 2.00 = 120.00
-------------------------------------------------------------------------------
SUM: 9804 525535 656659 3356941 x 1.82 = 6099812.36
-------------------------------------------------------------------------------
Unix> cat everything.file
http://cloc.sourceforge.net v 0.72
-------------------------------------------------------------------------------
Report File files blank comment code scale 3rd gen. equiv
-------------------------------------------------------------------------------
databases.lang 5703 327274 462152 2277970 x 1.45 = 3293481.01
script_lang.lang 4101 198261 194507 1078971 x 2.60 = 2806331.35
-------------------------------------------------------------------------------
SUM: 9804 525535 656659 3356941 x 1.82 = 6099812.36
-------------------------------------------------------------------------------
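Summing reports amounts to adding the files, blank, comment, and code columns language by language. The Python below is a sketch of that idea, checked against the C and yacc rows of the three database reports; the data layout and the function name sum_reports are illustrative choices, not cloc internals.

```python
from collections import defaultdict

def sum_reports(reports):
    """Add (files, blank, comment, code) tuples per language across reports."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for report in reports:
        for language, counts in report.items():
            for index, value in enumerate(counts):
                totals[language][index] += value
    return {language: tuple(values) for language, values in totals.items()}

# Rows copied from the three report files shown earlier (C and yacc only).
mysql = {"C": (790, 74025, 85474, 412662)}
sqlite = {"C": (65, 4603, 20237, 49674), "yacc": (1, 108, 41, 933)}
postgresql = {"C": (743, 75657, 126646, 407089), "yacc": (6, 1958, 2399, 14178)}

combined = sum_reports([mysql, sqlite, postgresql])
print(combined["C"])     # matches the C row of databases.lang
print(combined["yacc"])  # matches the yacc row of databases.lang
```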

The last two columns of cloc's output, "scale" and "3rd gen. equiv", are rough indications of how many lines of code would be needed by a hypothetical third-generation computer language. The values in these columns should be taken with a large grain of salt. They can be suppressed entirely with the --no3 option to produce cleaner output. Here's what the output looks like for the same Perl 5.8.8 count shown above:
prompt> cloc --no3 --extract-with='tar zxf >FILE<' perl-5.8.8.tar.gz
tar zxf perl-5.8.8.tar.gz
3106 text files.
2975 unique files.
1132 files ignored.
http://cloc.sourceforge.net v 0.90 T=70.0 s (27.9 files/s, 9480.5 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Perl 1564 73960 88294 217162
C 115 14872 17107 120583
C/C++ Header 132 8426 21237 45229
Bourne Shell 111 2987 5346 32954
Lisp 1 583 1772 6121
Make 8 479 459 2113
Teamcenter def 2 0 0 1345
yacc 2 125 72 1047
C++ 3 101 214 444
DOS Batch 11 85 50 322
HTML 1 19 2 98
Java 2 6 1 23
-------------------------------------------------------------------------------
SUM: 1952 101643 134554 427441
-------------------------------------------------------------------------------
If you use the report summation feature, make sure all inputs were produced the same way, either all with the --no3 option or all without.
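The relationship between the "code", "scale", and "3rd gen. equiv" columns is a plain multiplication per row. The Python sketch below reproduces the figures of the Perl 5.8.8 report shown earlier; that the scale on the SUM line is the total equivalent divided by the total code count is an inference from those numbers, not something stated by cloc.

```python
# (language, code lines, scale factor) copied from the Perl 5.8.8 report.
ROWS = [
    ("Perl", 217162, 4.00), ("C", 120583, 0.77), ("C/C++ Header", 45229, 1.00),
    ("Bourne Shell", 32954, 3.81), ("Lisp", 6121, 1.25), ("Make", 2113, 2.50),
    ("Teamcenter def", 1345, 1.00), ("yacc", 1047, 1.51), ("C++", 444, 1.51),
    ("DOS Batch", 322, 0.63), ("HTML", 98, 1.90), ("Java", 23, 1.36),
]

def third_gen_equiv(code, scale):
    """Third-generation equivalent for one row: code lines times scale factor."""
    return code * scale

total_code = sum(code for _, code, _ in ROWS)
total_equiv = sum(third_gen_equiv(code, scale) for _, code, scale in ROWS)
overall_scale = total_equiv / total_code  # inferred meaning of the SUM line's scale

print(total_code, round(total_equiv, 2), round(overall_scale, 2))
```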

Identifying comments within source code is trickier than one might expect. Many languages would need a complete parser to be counted correctly. cloc does not attempt to parse any of the languages it aims to count and therefore is an imperfect tool. The following are known problems:
printf(" /* ");
for (i = 0; i < 100; i++) {
a += i;
}
printf(" */ ");
appear to cloc as two lines of C code (parts of the two printf()
lines) and three lines of comments (the entire for loop).
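The effect is easy to reproduce with any regex-based comment remover. The Python sketch below uses a simplified pattern for C comments, not cloc's actual filter, and it assumes that the newlines inside a removed comment are kept so that line positions survive. Under those assumptions it arrives at the same two code lines and three comment lines.

```python
import re

SNIPPET = '''printf(" /* ");
for (i = 0; i < 100; i++) {
    a += i;
}
printf(" */ ");'''

def naive_counts(source):
    """Return (code_lines, comment_lines) using a regex that ignores string literals."""
    total = sum(1 for line in source.split("\n") if line.strip())
    stripped = re.sub(
        r"/\*.*?\*/",
        lambda m: "\n" * m.group(0).count("\n"),  # keep line breaks of the removed span
        source,
        flags=re.DOTALL,
    )
    code = sum(1 for line in stripped.split("\n") if line.strip())
    return code, total - code

print(naive_counts(SNIPPET))
```

Avoiding this would require tracking string literals, which is the kind of parsing cloc deliberately does not attempt.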

Al Danial

Wolfram Rösler provided most of the code examples in the test suite. These examples come from his Hello World Collection.
Ismet Kursunoglu found errors with the MUMPS counter and provided access to a computer with a large body of MUMPS code to test cloc.
Tod Huggins gave helpful suggestions for the Visual Basic filters.
Anton Demichev found a flaw with the JSP counter in cloc v0.76 and wrote the XML output generator for the --xml option.
Reuben Thomas pointed out that ISO C99 allows // as a comment marker, provided code for the --no3 option and for counting the m4 language, and suggested several user-interface enhancements.
The development of cloc was partially funded by the Northrop Grumman Corporation.

Copyright (c) 2006-2008, Northrop Grumman Corporation /
Information Technology / IT Solutions

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.