cb2Bib: Overview
The cb2Bib is a tool for rapidly extracting unformatted or
unstandardized bibliographic references from email alerts, journal Web pages, and
PDF files.
Current version: cb2Bib 1.0.2. (See ChangeLog File for a detailed list of changes and
acknowledgments, and Release Notes
for additional notes and information.)
See also Release Note
cb2Bib 1.0.0.
The cb2Bib
reads the clipboard text contents and processes them against a set of predefined
patterns. If this automatic detection is successful, cb2Bib formats the clipboard
data according to the structured BibTeX reference standard.
Otherwise, if no predefined format pattern is found or if detection proves to be
difficult, cb2Bib greatly simplifies manual data extraction. In most cases,
such manual data extraction will provide a new, personalized pattern to be
included within the predefined pattern set for future automatic extractions.
Once the bibliographic reference is correctly extracted, it is added to a
specified BibTeX database file. Optionally, document files are renamed to their citeID
and moved to a desired directory as a personal article library, and their metadata
is updated with the bibliographic reference. See the Configuring Documents section.
- Select the reference to import from the email or web browser
On Unix machines, cb2Bib automatically detects mouse selections and clipboard
changes. On Windows machines, copy or Ctrl-C is necessary to activate cb2Bib
automatic processing.
- cb2Bib automatic processing
Once text is selected, cb2Bib initiates the automatic reference extraction. It uses
the predefined patterns from the file regexp.txt to attempt automatic
extraction. See the Configuring
Files section for setting the user's predefined pattern matching expression file.
After a successful detection, the bibliographic fields appear in the cb2Bib item line
edits. Manual editing is possible at this stage.
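The idea of matching clipboard text against a predefined pattern and mapping the captured pieces to BibTeX fields can be sketched as follows. This is only an illustration: the pattern, the three-line input layout, the helper names `extract_fields` and `to_bibtex`, and the cite key are assumptions made for this example. They do not reproduce the syntax of regexp.txt or cb2Bib's internal code.

```python
import re

# Hypothetical pattern for a three-line reference; NOT taken from regexp.txt.
# Line 1: authors, line 2: title, line 3: "Journal Volume, First-Last (Year)".
PATTERN = re.compile(
    r"^(?P<author>.+)\n"
    r"(?P<title>.+)\n"
    r"(?P<journal>.+?)\s+(?P<volume>\d+),\s*(?P<pages>\d+-\d+)\s+\((?P<year>\d{4})\)$"
)


def extract_fields(clipboard_text):
    """Return a dict of bibliographic fields, or None if the pattern fails."""
    match = PATTERN.match(clipboard_text.strip())
    return match.groupdict() if match else None


def to_bibtex(fields, cite_id):
    """Format the extracted fields as a BibTeX @article entry."""
    lines = [f"@article{{{cite_id},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Made-up sample text standing in for the clipboard contents.
sample = (
    "J. Smith and A. Jones\n"
    "A study of clipboard parsing\n"
    "J. Chem. Phys. 120, 1234-1240 (2004)"
)

fields = extract_fields(sample)
if fields is not None:
    print(to_bibtex(fields, "smith2004"))
```

When `extract_fields` returns `None`, no pattern matched, which corresponds to the case where manual extraction is needed.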
- cb2Bib manual processing
If no predefined format pattern is found or if detection proves to be difficult, a
manual data extraction must be performed. Select individual reference items from
the cb2Bib clipboard area. A popup menu will appear after a selection is made. Choose
the corresponding bibliographic field. See BibTeX Entry Types available as cb2Bib fields.
The selection is postprocessed and added to the cb2Bib item line edit. cb2Bib field
tags will show in the cb2Bib clipboard area. Once the manual processing is done,
the cb2Bib clipboard area will contain the matching pattern. The pattern can be further
edited and stored to the regexp.txt file using Insert Regular
Expression, Alt+I. See the Extracting
Data from the Clipboard and The
Regular Expression Editor sections.
- Download reference to cb2Bib
The cb2Bib has built-in functionality to interact with publishers' "Download
reference to Citation Manager" service. Choose the BibTeX format, or any other format
that you can translate using the External Clipboard Preparsing Command. See
Additional, Keyboard
Functionality, Alt C. Click "Download" in your browser. When asked
"Open with...", select cb2Bib. The cb2Bib will be launched if no running instance is
found. If it is already running, it will place the downloaded reference on the clipboard
and start processing. Make sure your running instance is aware of clipboard
changes. See Buttons
Functionality. For convenience, the shell script dl_cb2bib and
the desktop config file dl_cb2bib.desktop are also provided.
- Adding documents
PDF and other documents can be added to the BibTeX reference by dragging the file
icon and dropping it into the cb2Bib panel. Optionally, document files are
renamed to their citeID and moved to a desired directory as a personal article
library (see the Configuring Documents section). Documents linked
to a reference correspond to the BibTeX tag file. Most
reference manager software can retrieve and display these files.
Downloading, copying, and/or moving is scheduled and performed once the reference is
accepted, e.g., once it is saved by pressing the Save Reference button.
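As a rough illustration of renaming a document to its citeID and moving it into a library directory, consider the sketch below. The function name `file_document`, the cite key `smith2004`, and the directory layout are assumptions for this example only. How cb2Bib actually builds citeIDs and schedules the move is set in the Configuring Documents section and is not reproduced here.

```python
import shutil
import tempfile
from pathlib import Path


def file_document(source, cite_id, library_dir):
    """Move `source` into `library_dir`, renamed to `<cite_id><original suffix>`."""
    source = Path(source)
    library_dir = Path(library_dir)
    library_dir.mkdir(parents=True, exist_ok=True)
    target = library_dir / f"{cite_id}{source.suffix}"
    if target.exists():
        # Refuse to overwrite an existing library entry.
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(source), str(target))
    return target


# Demonstration in a throwaway directory, with a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    pdf = Path(tmp) / "download.pdf"
    pdf.write_bytes(b"%PDF-1.4 placeholder")
    moved = file_document(pdf, "smith2004", Path(tmp) / "library")
    moved_name = moved.name
    moved_exists = moved.exists()
    source_gone = not pdf.exists()
    # The resulting path is what a BibTeX `file` tag would point to.
    print(moved_name)
```

In this sketch the move happens immediately; in cb2Bib the equivalent action is deferred until the reference is accepted.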
- Multiple retrieving from PDF files
Multiple PDF files, or other files convertible to text, can be sequentially processed by dragging
a set of files into cb2Bib's PDFImport dialog. After the processing button is pressed,
files are sequentially converted to text and sent to the cb2Bib clipboard panel for
reference extraction. See PDF
Reference Import for details.
- Journal-Volume-Page Queries
Takes the input Journal, Volume, and first page from the corresponding edit lines and
attempts to complete the reference. Additionally, queries consider Title, DOI, and
an excerpt, which is a simplified version of the clipboard panel contents.
See the Configuring
Network section, the distribution file netqinf.txt, and Release Note cb2Bib 0.3.5 for
customization and details.
- BibTeX Editor
cb2Bib includes a practical text editor suitable for corrections and additions.
cb2Bib capabilities are readily available within the editor. For example, a reference is
first sent to cb2Bib by selecting it, and later retrieved from cb2Bib to the editor
using 'right click' + 'Paste Current BibTeX'. Interconversions between Unicode and
LaTeX and between long and abbreviated journal names, as well as adding/renaming PDF
files, are easily available. The BibTeX Editor is also accessible through a shell command line.
See The cb2Bib Command Line and
Embedded File Editor.
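A Unicode <-> LaTeX interconversion of the kind mentioned above can be pictured as a table lookup in both directions. The sketch below is a minimal illustration under that assumption: the four-entry table and the function names `to_latex` and `to_unicode` are made up for this example, and cb2Bib's own conversion table and rules are not shown in this document.

```python
import unicodedata

# Tiny illustrative table; a real converter would cover far more characters.
UNICODE_TO_LATEX = {
    "\u00e9": r"{\'e}",   # e with acute accent
    "\u00fc": r'{\"u}',   # u with diaeresis
    "\u00f1": r"{\~n}",   # n with tilde
    "\u00e7": r"{\c c}",  # c with cedilla
}
LATEX_TO_UNICODE = {latex: char for char, latex in UNICODE_TO_LATEX.items()}


def to_latex(text):
    """Replace known non-ASCII characters with LaTeX accent commands."""
    # Normalize first so that decomposed accents match the table keys.
    text = unicodedata.normalize("NFC", text)
    return "".join(UNICODE_TO_LATEX.get(ch, ch) for ch in text)


def to_unicode(text):
    """Replace known LaTeX accent commands with Unicode characters."""
    for latex, char in LATEX_TO_UNICODE.items():
        text = text.replace(latex, char)
    return text


name = "Jos\u00e9 Mu\u00f1oz"
latex_name = to_latex(name)
print(latex_name)
print(to_unicode(latex_name) == name)
```

Characters missing from the table pass through unchanged, so the round trip is only exact for text covered by the table.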
- About
About cb2Bib, bookmarks, and online help
- Search references
Opens cb2Bib's search dialog. The search is performed either on the current
BibTeX file or on all BibTeX files in the current directory. Optionally, the
search is extended to the files linked to the references. Hits are displayed in an editor window.
See Search BibTeX files for
references. See also the Configuring Utilities section to
configure the external to-text converter.
- PDFImport
Launches cb2Bib's PDFImport window. Files dragged into the PDFImport window are
sequentially translated to text and sent to the cb2Bib clipboard panel. The cb2Bib
automatic and manual capabilities are then easily available to extract and
supervise reference extractions. See PDF Reference Import.
- Dis/Connect Clipboard
Toggles automatic cb2Bib and desktop clipboard connection. While the automatic
cb2Bib-clipboard connection permits reducing keystrokes, the disconnected mode is
needed in cases where multiple mouse selections or copies are required to complete
a reference extraction. See also Release Note cb2Bib 0.4.1 and Release Note cb2Bib 0.2.1 if you
experience problems with this feature.
- Network Reference Query
Starts a Network Query. It usually takes the input Journal, Volume, and first page from
the corresponding edit lines and attempts to complete the reference. See the Configuring Network
section to customize querying. See the distribution file netqinf.txt
and also Release Note cb2Bib
0.3.5 for the details.
- View BibTeX Reference
Views the current reference as it will be output to the BibTeX file. Any manual
changes should be made in the item line edit.
- Save Reference
Inserts the current bibliographic reference into the output BibTeX file. This action
decides whether or not a reference is accepted. Scheduled actions such as PDF
downloading, copying or renaming will be performed at this time.
- Open BibTeX File
Opens the current BibTeX output file. Right click within the BibTeX Editor window
for its particular functionality. See also Embedded File Editor.
- Alt B
Edits the Bookmarks and Network Query Info file netqinf.txt.
- Alt C
Preparses cb2Bib's clipboard through a user-specified external script or tool.
Preparsing is necessary to catch formatted references that cannot be easily
extracted using recognition patterns, or that are written in ambiguous formats.
Many available scripts or specific user-written tools can be incorporated to cb2Bib
through this external preparsing capability. In addition, simple, one-line scripts
can be used within PDFImport to provide, for instance, the journal name when
missing from the PDF first page. The cb2Bib distribution contains the sample
scripts isi2bib and ris2bib that convert ISI and RIS
formatted strings to BibTeX. See Configuring Clipboard for
details.
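To give a sense of what an external preparsing step might do, here is a minimal sketch that converts a RIS-formatted string to BibTeX. It is not the ris2bib script from the distribution. The function name `ris_to_bibtex`, the handled subset of RIS tags, and the cite key are assumptions for illustration, and the way cb2Bib passes text to and from the external tool (see Configuring Clipboard) is not modeled.

```python
import re

# A RIS line looks like "AU  - Smith, J.": two-character tag, spaces, hyphen, value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])\s+-\s?(.*)$")

# Subset of RIS tags handled in this sketch, mapped to BibTeX field names.
TAG_TO_FIELD = {
    "TI": "title", "T1": "title",
    "JO": "journal", "JF": "journal",
    "VL": "volume",
    "PY": "year", "Y1": "year",
}


def ris_to_bibtex(ris_text, cite_id):
    """Convert one RIS record to a BibTeX entry (always emitted as @article)."""
    authors, fields, pages = [], {}, {}
    for line in ris_text.splitlines():
        match = RIS_LINE.match(line.strip())
        if not match:
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag in ("AU", "A1"):
            authors.append(value)
        elif tag in ("SP", "EP"):
            pages[tag] = value
        elif tag in TAG_TO_FIELD:
            field = TAG_TO_FIELD[tag]
            if field == "year":
                value = value[:4]  # RIS dates may look like "2004///"
            fields.setdefault(field, value)
    if authors:
        fields["author"] = " and ".join(authors)
    if "SP" in pages:
        fields["pages"] = pages["SP"] + ("--" + pages["EP"] if "EP" in pages else "")
    body = "".join(f"  {name} = {{{value}}},\n" for name, value in sorted(fields.items()))
    return f"@article{{{cite_id},\n{body}}}"


# Made-up RIS record used only to exercise the function.
ris_sample = (
    "TY  - JOUR\n"
    "AU  - Smith, J.\n"
    "AU  - Jones, A.\n"
    "TI  - A study of clipboard parsing\n"
    "JO  - J. Chem. Phys.\n"
    "VL  - 120\n"
    "SP  - 1234\n"
    "EP  - 1240\n"
    "PY  - 2004///\n"
    "ER  - \n"
)
bibtex_entry = ris_to_bibtex(ris_sample, "smith2004")
print(bibtex_entry)
```

The entry type is fixed to `@article` here for brevity; a fuller converter would map the RIS `TY` tag to the corresponding BibTeX entry type.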
- Alt D
Deletes the temporary BibTeX output file. This permits using cb2Bib output files as
temporary media to transfer references to a preferred reference manager and
preferred format. Caution: This feature is not intended for users who
actually store their references in one or several BibTeX files. Remember to import
references before deleting the cb2Bib output file.
- Alt E
Edits the regular expression file. It permits easy access to and modification of
stored extraction patterns. New patterns are conveniently added to the regular
expression file by using the RegExp Editor button functionality.
- Alt F
Launches a file dialog for selecting the source file name for the file tag of the
BibTeX entry. Selected files are displayed either as the actual source
filename or as the target filename, depending on the file copy/rename/move
settings. See Configuring Documents. As an alternative
to Alt F, documents can be easily linked to a reference by dragging the
document file and dropping it onto the cb2Bib panel.
- Alt J
Edits the Journal Abbreviations file.
- Alt O
Opens the currently linked document for browsing. Documents can be easily linked to
a reference by dragging the document file and dropping it to the cb2Bib panel, or
with Alt F. Linked documents correspond to the BibTeX tag
file.
- Alt P
Postprocesses the BibTeX output file. It launches a user-specified script or program to
postprocess the current BibTeX file. The cb2Bib distribution contains two sample
scripts. One, bib2pdf, is a shell script for running latex
and bibtex; this makes it possible to check the BibTeX file for errors
and to easily produce suitable output for printing. The other,
bib2end.bat, is a batch script for running bib2xml and
xml2end, which converts references into Endnote format. See Configuring BibTeX for
details.
- Alt R
Restarts the cb2Bib automatic engine. Takes input data not from the system
clipboard but from the cb2Bib clipboard panel. This permits editing the input
stream from poorly translated PDF captions, correcting for author superscripts, or
debugging regular expressions.
- Esc
Quits the cb2Bib popup menu. The cb2Bib menu pops up each time a selection is made in
the clipboard panel. This saves keystrokes in a normal bibliographic extraction.
Press Esc or the right mouse button if you need to gain access to
the editor cut/copy/paste functionality instead.
This section describes the details of the internal cb2Bib
recognition scheme.
- Extracting Data from the
Clipboard
- Processing of author's
names
- Processing of journal
names
The usefulness of cb2Bib increases with a set of reliable regular
expressions. It can therefore be worthwhile to share one's favorite regexps among
cb2Bib users. If you have a working (which does not mean perfect) regexp that could
benefit other users, please take a moment to fill out the RegExp Submission Form. These regexps will later be
included in the cb2Bib distribution, as received, without any additional editing.
Interested users can then copy and paste the cb2Bib regexps they need into their own regexp
file. In this way, not much of anybody's time and effort should be needed.
To compile cb2Bib, the following libraries
must be present and accessible:
- Qt 4.3.0 or higher from Trolltech. On a Linux platform with Qt preinstalled, make sure that
the devel packages and Qt tools are also present.
- X11 header files if compiling on Unix platforms. Specifically, the headers
X11/Xlib.h and X11/Xatom.h are needed, unless cb2Bib is
configured as ./configure --disable_cbpoll. On native MacOSX and
Windows, cbpoll is already disabled by default.
- The header files fcntl.h and unistd.h from the
glibc-devel package are also required. Otherwise compilation will fail
with referencelist.cpp:227: `close' undeclared.
Although not needed for running cb2Bib, the
following tools extend cb2Bib applicability:
- The bib2xml and xml2end BibUtils, to test the postprocess script bib2end.bat on
Windows platforms.
- ... and LaTeX and friends, to check for BibTeX file correctness and to get
nice printed output through the shell script bib2pdf.
The cb2Bib icons are taken from the Oxygen, Crystal SVG, and
Noia icon sets, which can be found at the KDE Desktop Environment. Several people have contributed
suggestions, bug reports, or patches. For a detailed list of acknowledgments see the
ChangeLog File.
The cb2Bib program is licensed under the terms of the GNU General Public
License version 3.
The cb2Bib, Pere Constans, Copyright © 2004-2008. All rights
reserved.
First released as version 0.1.0 on 2004-06-29.
Last updated on 2008-07-21.