cb2Bib is a free, open-source, multiplatform application for rapidly extracting unformatted or unstandardized bibliographic references from email alerts, journal Web pages, and PDF files.
cb2Bib facilitates the capture of single references from unformatted and nonstandard sources. Output references are written in BibTeX. Article files can be easily linked and renamed by dragging them onto the cb2Bib window. Additionally, it permits editing and browsing BibTeX files, citing references, searching references and the full contents of the referenced documents, inserting bibliographic metadata into documents, and writing short notes that interrelate several references.
Current version: cb2Bib 2.0.3. See Change Log File for a detailed list of changes and acknowledgments, and Release Notes for additional notes and information.
See also Release Note cb2Bib 2.0.3.
cb2Bib Description
cb2Bib reads the clipboard text contents and processes them against a set of predefined patterns. If this automatic detection is successful, cb2Bib formats the clipboard data according to the structured BibTeX reference syntax.
Otherwise, if no predefined format pattern is found or if detection proves to be difficult, cb2Bib greatly simplifies manual data extraction. In most cases, such a manual extraction yields a new, personalized pattern that can be included in the predefined pattern set for future automatic extractions.
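The detect-then-format step described above can be illustrated with a minimal Python sketch. It is not cb2Bib's implementation and does not use the syntax of its pattern file: the regular expression, the function name clipboard_to_bibtex, and the reference layout "Authors. Title. Journal Volume, Pages (Year)" are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for references shaped like
# "Authors. Title. Journal Volume, Pages (Year)".
# It is NOT one of the patterns distributed with cb2Bib.
PATTERN = re.compile(
    r"^(?P<author>.+?)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>.+?)\s+"
    r"(?P<volume>\d+),\s+"
    r"(?P<pages>\d+(?:-\d+)?)\s+"
    r"\((?P<year>\d{4})\)$"
)


def clipboard_to_bibtex(text, cite_id="ref"):
    """Return a BibTeX @article string, or None when the pattern fails."""
    # Collapse line breaks and repeated blanks, as copied text is often wrapped.
    match = PATTERN.match(" ".join(text.split()))
    if match is None:
        return None  # the caller would fall back to manual extraction
    fields = match.groupdict()
    # BibTeX separates author names with " and ".
    fields["author"] = " and ".join(
        name.strip() for name in fields["author"].split(",")
    )
    lines = [f"@article{{{cite_id},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)
```

A return value of None corresponds to the case where no predefined pattern matches and the fields must be selected by hand.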
Once the bibliographic reference is correctly extracted, it is added to a specified BibTeX database file. Optionally, document files are renamed to a DocumentID filename and moved to a desired directory as a personal article library, and their metadata is updated with the bibliographic reference. See the Reading and Writing Bibliographic Metadata section.
cb2Bib facilitates writing short notes related to bibliographic collections. Notes are written using a minimalist markup in a plain text editor, and can later be converted to HTML. Related references and links become easily accessible in any browser or in the embedded cb2Bib viewer. See Release Note cb2Bib 1.1.0.
Using cb2Bib
Procedure
- Select the reference to import from the email or web browser
On Unix machines, cb2Bib automatically detects mouse selections and clipboard changes. On Windows machines, Copy or Ctrl-C is necessary to activate cb2Bib automatic processing.
- cb2Bib automatic processing
Once text is selected, cb2Bib initiates the automatic reference extraction. It uses the predefined patterns from the file regexp.txt to attempt automatic extraction. See the Configuring Files section for setting the user predefined pattern matching expression file. After a successful detection, bibliographic fields appear on the cb2Bib item line edits. Manual editing is possible at this stage.
- cb2Bib manual processing
If no predefined format pattern is found or if detection proves to be difficult, a manual data extraction must be performed. Select the reference fields from the cb2Bib clipboard area, using either the mouse or Shift+arrow keys. A popup menu will appear after the selection is made. Choose the corresponding bibliographic field. See BibTeX Entry Types Available as cb2Bib Fields. When operating with the keyboard, the first letter of the field name serves as a menu shortcut: typing ‘A’ sets the selection to ‘author’, and ‘+A’ to ‘add authors’. The selection is postprocessed and added to the cb2Bib item line edit. cb2Bib field tags will show on the cb2Bib clipboard area. Once the manual processing is done, the cb2Bib clipboard area will contain the matching pattern. The pattern can be further edited and stored to the regexp.txt file using Insert Regular Expression, Alt+I. See the Extracting Data from the Clipboard and Regular Expression Editor sections.
- Download reference to cb2Bib
cb2Bib has built-in functionality to interact with publishers’ “Download reference to Citation Manager” services. Choose BibTeX format, or any other format that you can translate using External Clipboard Preparsing Command. See Additional Keyboard Functionality, Alt C. Click “Download” from your browser. When asked “Open with…” select cb2Bib. cb2Bib will be launched if no running instance is found. If already running, it will place the downloaded reference on the clipboard and start processing. Make sure your running instance is aware of clipboard changes. See Buttons Functionality. For convenience, the shell script c2bimport and the desktop config file c2bimport.desktop are also provided.
- Adding documents
PDF and other documents can be added to the BibTeX reference by dragging the file icon and dropping it into the cb2Bib panel. Optionally, document files are renamed to a DocumentID filename and moved to a desired directory as a personal article library (see the Configuring Documents section). Documents linked to a reference correspond to the BibTeX tag file. Most reference manager software can retrieve and display these files. Downloading, copying, and/or moving is scheduled and performed once the reference is accepted, e.g., once it is saved by pressing the Save Reference button.
- Multiple retrieving from PDF files
Multiple PDF files, or other files convertible to text, can be processed sequentially by dragging a set of files into the cb2Bib PDFImport dialog. Once processing is started, files are sequentially converted to text and sent to the cb2Bib clipboard panel for reference extraction. See PDF Reference Import for details.
- Journal-Volume-Page Queries
Takes the input Journal, Volume, and first page from the corresponding edit lines and attempts to complete the reference. Additionally, queries consider title, DOI, and an excerpt, which is a simplified version of the clipboard panel contents. See the Configuring Network section, the distribution file netqinf.txt, and Release Note cb2Bib 0.3.5 for customization and details.
- BibTeX Editor
cb2Bib includes a practical text editor suitable for corrections and additions. cb2Bib capabilities are readily available within the editor. E.g., the reference is first sent to cb2Bib by selecting it, and later retrieved from cb2Bib to the editor using ‘right click’ + ‘Paste Current BibTeX’. Interconversions Unicode <-> LaTeX, long <-> abbreviated journal name, and adding/renaming PDF files are easily available. BibTeX Editor is also accessible through a shell command line. See cb2Bib Command Line and Embedded File Editor.
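The "Adding documents" step above schedules the renaming of a dropped file and performs it only when the reference is accepted. The following Python sketch illustrates that deferred behavior under stated assumptions: the class PendingDocument, the helper file_field, and the naming rule (DocumentID plus the original file extension) are hypothetical and do not reproduce cb2Bib's own code or its configurable DocumentID rules.

```python
import shutil
from pathlib import Path


class PendingDocument:
    """A document move that is scheduled now and carried out on save."""

    def __init__(self, source, library_dir, document_id):
        self.source = Path(source)
        # Assumed naming rule: DocumentID plus the original file extension.
        self.target = Path(library_dir) / (document_id + self.source.suffix)

    def commit(self):
        """Perform the scheduled move; call only once the reference is accepted."""
        self.target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(self.source), str(self.target))
        return self.target


def file_field(pending):
    """Format the BibTeX file tag that points at the target location."""
    return f"  file = {{{pending.target}}},"
```

Nothing touches the file system until commit() is called, so discarding the reference before saving leaves the dropped file where it was.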
Buttons Functionality
- About
About cb2Bib, bookmarks, and online help.
- Configure
Configure cb2Bib. See Configuration section.
- Search references
Opens the cb2Bib search dialog. The search is performed either on the current BibTeX file or on all BibTeX files in the current directory. Optionally, the search is extended to the documents linked to the references. Hits are displayed in an editor window. See Search BibTeX and PDF Document Files. See also the Configuring Utilities section to configure the external to-text converter.
- PDFImport
Launches the cb2Bib PDFImport window. Files dragged into the PDFImport window are sequentially translated to text and sent to the cb2Bib clipboard panel. cb2Bib automatic and manual capabilities are then readily available to extract and supervise reference extractions. See PDF Reference Import.
- Exit
Exits cb2Bib.
- Dis/Connect Clipboard
Toggles the automatic connection between cb2Bib and the desktop clipboard. While the automatic cb2Bib-clipboard connection reduces keystrokes, the disconnected mode is needed in cases where multiple mouse selections or copies are required to complete a reference extraction. See also Release Note cb2Bib 0.4.1 and Release Note cb2Bib 0.2.1 if you experience problems with this feature.
- Network Reference Query
Starts Network Query. It usually takes the input Journal, Volume, and first page from the corresponding edit lines and attempts to complete the reference. See the Configuring Network section to customize querying. See the distribution file netqinf.txt and also Release Note cb2Bib 0.3.5 for the details.
- View BibTeX Reference
Views the current reference as it will be output to the BibTeX file. Any manual changes should be made in the item line edits.
- Save Reference
Inserts the current bibliographic reference into the output BibTeX file. This action decides whether or not a reference is accepted. Scheduled actions such as PDF downloading, copying, or renaming will be performed at this time.
- Open BibTeX File
Opens the current BibTeX output file. Right click within the BibTeX Editor window for its particular functionality. See also Embedded File Editor.
Additional Keyboard Functionality
Most keyboard shortcuts are customizable. See Configuring Shortcuts. In the following, default shortcuts are used to describe functionality.
- Alt A
Starts cb2Bib Annote. Specify the note’s filename in the dialog. A new note is created if the file name does not exist. The cb2Bib Annote is opened as a separate program. Exiting cb2Bib will not exit the note’s viewer. On the viewer, pressing key E launches the default text editor. The viewer will track the editor, and will update the note’s display each time the editor saves it. The viewer’s functionality is disabled if cb2Bib was not compiled and linked against the QtWebKit or QtWebEngine library. See cb2Bib Command Line to use Annote in command line mode.
- Alt B
Edits the Bookmarks and Network Query Info file netqinf.txt.
- Alt C
Preparses the cb2Bib clipboard through a user-specified external script or tool. Preparsing is necessary to catch formatted references that cannot be easily extracted using recognition patterns, or that are written in ambiguous formats. Many available scripts or specific user-written tools can be incorporated into cb2Bib through this external preparsing capability. In addition, simple, one-line scripts can be used within PDFImport to provide, for instance, the journal name when it is missing from the PDF first page. The cb2Bib distribution contains the sample scripts isi2bib and ris2bib that convert ISI and RIS formatted strings to BibTeX. See Configuring Clipboard for details.
- Alt D
Deletes the temporary BibTeX output file. This permits using cb2Bib output files as temporary media to transfer references to a preferred reference manager and preferred format. Caution: This feature is not intended for users who actually store their references in one or several BibTeX files. Remember to import references before deleting the cb2Bib output file.
- Alt E
Edits the regular expression file. It permits easy access to and modification of stored extraction patterns. New patterns are conveniently added to the regular expression file by using the RegExp Editor button functionality.
- Alt F
Launches a file dialog for selecting the source file name for the BibTeX entry file. Selected files are displayed either as the actual source filename or as the target filename, depending on the file copy/rename/move settings. See Configuring Documents. As an alternative to Alt F, documents can be easily linked to a reference by dragging the document file and dropping it onto the cb2Bib panel.
- Alt I
Edits and optionally inserts the current regular expression pattern. See the Extracting Data from the Clipboard and Regular Expression Editor sections.
- Alt J
Edits the Journal Abbreviations file.
- Alt O
Opens the currently linked document for browsing. Documents can be easily linked to a reference by dragging the document file and dropping it onto the cb2Bib panel, or with Alt F. Linked documents correspond to the BibTeX tag file.
- Alt P
Postprocesses the BibTeX output file. It launches a user-specified script or program to postprocess the current BibTeX file. The cb2Bib distribution contains two sample scripts. One, bib2pdf, is a shell script for running latex and bibtex; this permits checking the BibTeX file for possible errors and easily producing suitable output for printing. The other one, bib2end.bat, is a batch script for running bib2xml and xml2end, which converts references into Endnote format. See Configuring BibTeX for details.
- Alt R
Restarts the cb2Bib automatic engine. Takes input data not from the system clipboard but from the cb2Bib clipboard panel. This permits editing the input stream from poorly translated PDF captions, correcting for author superscripts, or debugging regular expressions.
- Alt W
Writes the current reference to the source document file. This option is intended for writing and updating bibliographic metadata in document files without needing to use BibTeX files. Only local and writable files are considered.
- Alt X
Check Repeated looks for existing references in the BibTeX directory similar to the current one. The search is done for exact cite ID, and for title and author field values, or, if empty, for booktitle and editor, using the approximate string search pattern. See also Configuring BibTeX.
- F4
Toggles between Main and Other Fields reference edit tabs.
- Esc
Quits the cb2Bib popup menu. The cb2Bib menu pops up each time a selection is made in the clipboard panel. This saves keystrokes in a normal bibliographic extraction. Press Esc or the right mouse button if you need to gain access to the editor cut/copy/paste functionality instead.
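The external preparsing described under Alt C can be pictured with a small RIS-to-BibTeX converter. The Python sketch below is not the ris2bib script from the distribution, and it deliberately leaves out how cb2Bib invokes an external tool (see Configuring Clipboard for that). The function name ris_to_bibtex and the tag table are assumptions, and only a handful of RIS tags are handled.

```python
import re

# Small, illustrative subset of RIS tags; the RIS format defines many more.
RIS_TO_BIBTEX = {
    "TI": "title", "T1": "title",
    "JO": "journal", "JF": "journal",
    "VL": "volume", "SP": "pages",
    "PY": "year", "DO": "doi",
}
# An RIS line looks like "AU  - Doe, J.": two-character tag, dash, value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")


def ris_to_bibtex(ris_text, cite_id="ref"):
    """Convert one RIS record into a BibTeX @article string."""
    fields = {}
    authors = []
    for line in ris_text.splitlines():
        match = RIS_LINE.match(line.strip())
        if match is None:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag in ("AU", "A1"):
            authors.append(value)
        elif tag == "EP" and "pages" in fields:
            # Append the end page; assumes SP was seen before EP.
            fields["pages"] += "--" + value
        elif tag in RIS_TO_BIBTEX and value:
            # Keep the first value when a field is given twice.
            fields.setdefault(RIS_TO_BIBTEX[tag], value)
    if authors:
        fields = {"author": " and ".join(authors), **fields}
    body = "\n".join(f"  {key} = {{{val}}}," for key, val in fields.items())
    return f"@article{{{cite_id},\n{body}\n}}"
```

A converter of this kind turns an ambiguous or tagged input into BibTeX before the recognition patterns are applied, which is the role the text assigns to preparsing.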
Advanced Features
Advanced features, and processing and extraction details are described in the following sections:
- Automatic Extraction: Questions and Answers
- Extracting Data from the Clipboard
- Processing of Author Names
- Processing of Journal Names
- Field Recognition Rules
- Regular Expression Editor
Configuration information is described in the following sections:
Utilities and modules are described in the following sections:
- Search BibTeX and PDF Document Files
- Embedded File Editor
- PDF Reference Import
- Reading and Writing Bibliographic Metadata
- cb2Bib Command Line
- cb2Bib Annote
- cb2Bib Citer
Requirements
Compilation
To compile cb2Bib, the following libraries must be present and accessible:
- Qt 5.7.0 or later from Qt Project. On a Linux platform with Qt preinstalled, make sure that the devel packages and Qt tools are also present.
- QtWebKit or QtWebEngine library (optional) to compile cb2Bib Annote viewer. No special action or flag is needed during compilation.
- Compression libraries LZ4 or LZO (optional). To choose a particular one, type configure --enable-lz4 or configure --enable-lzo. On machines with the SSE4 instruction set, the LZSSE compressor can be used in place of LZ4 and LZO by typing configure --enable-lzsse. If none of the above compressors is appropriate on a particular platform, type configure --enable-qt-zlib before compiling.
- On machines with the AVX2 instruction set, consider using configure --enable-avx2, as this will improve cb2Bib search performance.
- X11 header files if compiling on Unix platforms. Concretely, headers X11/Xlib.h and X11/Xatom.h are needed.
- The header files fcntl.h and unistd.h from the glibc-devel package are also required. Otherwise compilation will fail with ‘::close’ undeclared.
Deployment
Although not needed for running cb2Bib, the following tools extend cb2Bib applicability:
- MathJax, available at www.mathjax.org, for displaying mathematical notation. Simply download and unzip it in a desired directory. See Configuring Annote.
- ExifTool, version 7.31 or later, available at exiftool.org, for metadata insertion.
- pdftotext, found packaged as xpdf, and downloadable from www.xpdfreader.com/download.html.
- The bib2xml and xml2end BibUtils, for the postprocessing script bib2end.bat on Windows platforms.
- LaTeX packages, for checking BibTeX files correctness and for references printing through the shell script bib2pdf.
Credits and License
The cb2Bib icons are taken from the Oxygen, Crystal SVG, and Noia icon sets, which can be found at the KDE Desktop Environment. Several people have contributed suggestions, bug reports, or patches. For a detailed list of acknowledgments see the Change Log File.
The cb2Bib program is licensed under the terms of the GNU General Public License version 3.
Last updated on 2025-11-13.
First released version 0.1.0 on 2004-06-29.
© 2004-2025 Pere Constans