A data processing tool.

Mark George authored on 24 Nov 2017

JRex

Provides a handy user interface for manipulating text-based data.

It is fully functional and appears to be quite robust, but it was put together quickly, with features added on an as-needed basis, so the code needs a bit of a cleanup.

Features

  • Can group data into columns using capturing regular expressions (column per capture group).
  • Can reformat the captured groups using a custom format expression.
  • Can transpose the rows and columns.
  • Can pivot the data (both wide to long, and long to wide).
  • Can perform bulk replacements (using regular expressions if required).
  • Convenience features for working with CSV (comma-separated) and TSV (tab-separated) data.
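As a rough illustration of the transpose feature, the sketch below swaps the rows and columns of a table held as lists of strings. It is not JRex's own code: the class and method names are hypothetical, and padding short rows with empty strings is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not taken from the JRex source.
final class TransposeSketch {

    // Turns each column of the input into a row of the output.
    // Assumption: rows shorter than the widest row are padded with "".
    static List<List<String>> transpose(List<List<String>> rows) {
        int width = 0;
        for (List<String> row : rows) {
            width = Math.max(width, row.size());
        }
        List<List<String>> result = new ArrayList<>();
        for (int c = 0; c < width; c++) {
            List<String> newRow = new ArrayList<>();
            for (List<String> row : rows) {
                newRow.add(c < row.size() ? row.get(c) : "");
            }
            result.add(newRow);
        }
        return result;
    }
}
```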

Launching

The binary JAR file is found in the dist folder of this repository.

It should run on any version of Java from version 7 onwards.

The JAR file in the dist folder should be executable. If your desktop environment is correctly configured you should be able to double-click the JAR file to run it. If not, run the following from the command line:

java -jar jrex.jar

You can use the following to get a dark-themed version:

java -jar jrex.jar dark

Quick Start

  1. Copy/paste some CSV into the input text box.
  2. Enter the number of columns into the number spinner and click the CSV button.
  3. You will see the regular expression being used appear in the Capture Expression text box, and the captured groups will appear in the table.
  4. Click the TSV button in the output section. This will produce a format expression in the Format Expression text box and show the TSV output in the output text box.

You can use your own capture/format expressions, and click the Capture/Format buttons to apply them.
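To give a feel for what the CSV and TSV buttons produce, here is a minimal sketch that builds a capture expression with one group per column and a matching tab-separated format expression. The exact expressions JRex generates may well differ (for example, in how quoted fields are handled); the class and method names are hypothetical, and "?" is assumed as the Column Prefix.

```java
// Hypothetical sketch; the expressions JRex actually generates may differ.
final class CsvExpressionSketch {

    // Builds a line-anchored expression with one capture group per column,
    // e.g. for 2 columns: ^([^,\r\n]*),([^,\r\n]*)$
    static String captureExpression(int columns) {
        StringBuilder sb = new StringBuilder("^");
        for (int i = 0; i < columns; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("([^,\\r\\n]*)");
        }
        return sb.append('$').toString();
    }

    // Builds a format expression that joins the column references with tabs.
    // Assumption: references are written as prefix + column number.
    static String tsvFormatExpression(int columns, String prefix) {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= columns; i++) {
            if (i > 1) {
                sb.append('\t');
            }
            sb.append(prefix).append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(captureExpression(3));
        System.out.println(tsvFormatExpression(3, "?"));
    }
}
```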

Basic Usage

  1. Copy and paste your text into the input text box. You can also use the Open button to load data from a file.
  2. Enter your regular expression for grouping the data into the Capture Expression text box. Refer to the Java API Pattern documentation for the details on Java's regular expression syntax.
  3. Click Capture. Captured groups will appear in the table.
  4. Enter an expression into the Format Expression text box to define the output format. Refer to captured groups using the Column Prefix followed by the column number, as shown in the table. For example, ?1 is the first column.
  5. Click Format.
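The capture and format steps above can be approximated with java.util.regex as follows. This is a sketch under stated assumptions rather than JRex's implementation: the class and method names are hypothetical, MULTILINE is assumed for the capture expression, and a reference to a column that does not exist is assumed to produce an empty string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not taken from the JRex source.
final class CaptureFormatSketch {

    // Each match of the capture expression becomes a row;
    // each capture group becomes a column.
    static List<String[]> capture(String input, String captureExpression) {
        // Assumption: MULTILINE so that ^ and $ apply per line.
        Matcher matcher = Pattern.compile(captureExpression, Pattern.MULTILINE).matcher(input);
        List<String[]> rows = new ArrayList<>();
        while (matcher.find()) {
            String[] row = new String[matcher.groupCount()];
            for (int g = 1; g <= matcher.groupCount(); g++) {
                row[g - 1] = matcher.group(g);
            }
            rows.add(row);
        }
        return rows;
    }

    // Replaces every prefix+number reference in the format expression
    // with the value of that column, producing one output line per row.
    static String format(List<String[]> rows, String formatExpression, String prefix) {
        Pattern reference = Pattern.compile(Pattern.quote(prefix) + "(\\d+)");
        List<String> lines = new ArrayList<>();
        for (String[] row : rows) {
            Matcher m = reference.matcher(formatExpression);
            StringBuffer line = new StringBuffer();
            while (m.find()) {
                int col = Integer.parseInt(m.group(1));
                boolean valid = col >= 1 && col <= row.length && row[col - 1] != null;
                // quoteReplacement keeps '$' and '\' in the data literal.
                m.appendReplacement(line, Matcher.quoteReplacement(valid ? row[col - 1] : ""));
            }
            m.appendTail(line);
            lines.add(line.toString());
        }
        return String.join("\n", lines);
    }
}
```

For example, capturing `a,b,c` with a three-group expression and formatting with `?3-?1` would give `c-a`.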

Search/Replace

Check the Search/Replace box to show the pane.

Add search and replace expressions to the table. These can be regular expressions, including capture groups and references. Capture references use the '$' character as the prefix (see Java API Matcher.replaceAll), so literal '$' characters in the replacement need to be escaped with a backslash ('\'). Use the appropriate buttons to add, delete, and move rows; replacements are made in the order in which they appear in the table.

The Replace button will apply the replacements to the data in the input text box and the result will appear in the output text box.

Note that each replacement is applied to the entire input. Regular expressions are executed with the MULTILINE flag enabled by default (^ and $ apply to lines instead of the entire input), but this can be disabled via the (?-m) flag at the start of the expression.
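The behaviour described in this section maps fairly directly onto Matcher.replaceAll. The sketch below applies an ordered list of search/replace pairs to the whole input with MULTILINE enabled. The class and method names are hypothetical and the code is an illustration, not JRex's own.

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not taken from the JRex source.
final class ReplaceSketch {

    // Applies each {search, replacement} pair in order to the entire text.
    static String applyAll(String input, String[][] replacements) {
        String text = input;
        for (String[] pair : replacements) {
            // MULTILINE by default; a leading (?-m) in the expression turns it off.
            Pattern pattern = Pattern.compile(pair[0], Pattern.MULTILINE);
            text = pattern.matcher(text).replaceAll(pair[1]);
        }
        return text;
    }

    public static void main(String[] args) {
        String[][] replacements = {
            {"^(\\w+) (\\d+)$", "$2 x $1"}, // swap the two captures on each line
            {" x ", " \\$ "},               // literal '$' escaped with a backslash
        };
        // Prints:
        // 5 $ apple
        // 12 $ banana
        System.out.println(applyAll("apple 5\nbanana 12", replacements));
    }
}
```

With the default behaviour, replacing `^a` with `b` in the two-line input `a`, `a` changes both lines; with `(?-m)^a` only the first line changes.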

Other Features

  • The output can be moved into the input box at any point by clicking the Output to Input button.
  • The output can be saved to a file at any point using the Save button.
  • Tab characters can be inserted into any text box via the right-click menu.
  • The Clear button clears the input box so that new data can be added.
  • Empty cells can be replaced with a new value using the Replace blanks? feature.
  • The Save/Load Expressions buttons can be used to save and reuse the expressions and replacements for common tasks.

Match Flags

Listed here, since I can never remember what they are. These can be added to a regular expression to change the match behaviour:

  • (?i) - Case insensitive matching.
  • (?s) - DOTALL mode where the dot character ('.') matches line breaks.
  • (?m) - MULTILINE mode where '^' and '$' match the start and end of a line instead of the start and end of the entire input.
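A small Java sketch showing the effect of each flag. The class and helper names are hypothetical; patterns here are compiled with no default flags, so each flag's effect is visible in isolation.

```java
import java.util.regex.Pattern;

// Hypothetical sketch demonstrating the embedded match flags.
final class MatchFlagsSketch {

    // True if the expression matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("hello", "HELLO"));       // false
        System.out.println(found("(?i)hello", "HELLO"));   // true

        System.out.println(found("a.b", "a\nb"));          // false
        System.out.println(found("(?s)a.b", "a\nb"));      // true

        System.out.println(found("^b$", "a\nb\nc"));       // false
        System.out.println(found("(?m)^b$", "a\nb\nc"));   // true
    }
}
```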