ThoughtLoom: Transform Data into Insights with OpenAI LLMs

ThoughtLoom is a powerful tool designed to foster creativity and enhance productivity through the use of LLMs directly from the command line. It facilitates rapid development and integration of LLM-based tools into various workflows, empowering individuals and teams to experiment, collaborate, and ultimately streamline their daily tasks.

By using ThoughtLoom's template system, teams can coordinate and fine-tune the most effective ways to leverage LLMs for their specific needs. This structured approach allows for seamless incorporation of LLMs into pipelines that can tackle increasingly complex objectives. ThoughtLoom's versatility enables users to create simple shell scripts or more sophisticated programs, depending on the requirements of the task at hand.

Ultimately, ThoughtLoom accelerates the adoption, experimentation, and practical use of LLMs by providing a flexible, user-friendly environment to build and share LLM-based tools. It's an invaluable resource for anyone looking to harness the power of language models in their daily work, whether for personal, team, or enterprise applications.

Demo

Features

  • Puts the power of the OpenAI ChatGPT APIs in your terminal
  • Template and configuration system so you can define and invoke arbitrary processing tasks
  • Parallel / batch processing of large data sets
  • Estimate API costs before running commands
  • JSON in, JSON out

Installation

go install github.com/tbiehn/thoughtloom@latest

Some GPT Thoughts on ThoughtLoom

ThoughtLoom is a powerful command-line tool that brings the potential of OpenAI APIs to the fingertips of power-users and professionals, enabling a more efficient and robust way to generate insights and narratives. While it may not revolutionize entire industries on its own, it does provide an enhanced user experience for those looking to leverage OpenAI APIs for practical, real-world applications.

  1. Streamlining Content Creation: ThoughtLoom allows writers, marketers, and other content creators to quickly generate drafts, summaries, and article ideas, boosting productivity and reducing the time spent on mundane tasks.

  2. Data Analysis Made Simple: Researchers, data analysts, and business professionals can use ThoughtLoom to extract valuable insights from complex datasets, making the process of data analysis more efficient and accessible.

  3. Enhancing Customer Support: ThoughtLoom can be utilized to analyze customer feedback and support queries, generating valuable insights that can help improve customer service and drive customer satisfaction.

  4. Automating Reporting: Business users can leverage ThoughtLoom to generate automated reports on key performance indicators, trends, and competitor analysis, saving time and resources.

  5. Simplifying Documentation: Developers, product managers, and technical writers can use ThoughtLoom to generate technical documentation, code comments, and API descriptions with ease, improving the overall development process.

ThoughtLoom is a versatile tool designed to empower power-users and professionals in their day-to-day tasks, making it easier to harness the capabilities of OpenAI APIs. While it may not change the world, it can certainly make a significant difference for those who use it, by simplifying complex tasks, improving productivity, and providing a more user-friendly experience.

Usage

ThoughtLoom is a powerful command-line tool designed to process JSON data using predefined templates and connect to OpenAI (or Azure) APIs to generate valuable insights. The following steps outline the general usage of ThoughtLoom:

  1. Supply input JSON data through stdin.
  2. Set your OpenAI API key using the 'OPENAI_API_KEY' environment variable or Azure AI API key using the 'AZUREAI_API_KEY' environment variable.

WARNING: API requests may incur costs. Use the flag '-d' to estimate your potential cost before running the actual process.

Command-line Flags

  • -ae string: Set the Azure HTTP Endpoint if using Azure. Set the environment variable 'AZUREAI_API_KEY' to your API key.
  • -am string: Set the model deployment name if using Azure.
  • -b: Disable the progress bar. This is a boolean flag; pass -b on its own, with no value, to turn it on.
  • -c string: Set the path to the configuration file (default ./config.toml).
  • -d: Perform a dry run, calculating token usage without making a request. This is a boolean flag; pass -d on its own, with no value, to turn it on.
  • -l string: Set the log level (options: debug, info, warn, error, fatal, panic) (default "info").
  • -p int: Set how many parallel calls to make to OpenAI (default 5).

Configs and Templates

Configuration files are TOML files that specify which templates to use when turning JSON inputs into OpenAI queries.

# Required parameters
max_tokens = 2048                      # The maximum number of tokens for the generated response.
model = "gpt-3.5-turbo"                # The model to use for text generation, e.g., gpt-3.5-turbo, gpt-4, etc.

# Optional parameters (if not provided, the program will use default values)
temperature = 0.7                      # Controls the randomness of the generated text.
top_p = 0.9                            # Controls the nucleus sampling, which limits the token selection to a subset.
presence_penalty = 0.1                 # Penalizes tokens that have already appeared at all, nudging the model toward new topics.
frequency_penalty = 0.1                # Penalizes tokens in proportion to how often they have already appeared, reducing repetition.

template_system = "./system.tmpl"      # The path to the system prompt template file.
template_user = "./user.tmpl"          # The path to the user prompt template file.

Optionally, few-shot examples can be specified by using a slightly different format:

max_tokens = 2048
model = "gpt-3.5-turbo"
template_system = "./system.tmpl"
temperature = 0.8

[[template_prompt]] 
	role = "user"
	template = "./1shotuser.tmpl"

[[template_prompt]] 
	role = "assistant"
	template = "./1shotassistant.tmpl"

[[template_prompt]] 
	role = "user"
	template = "./user.tmpl"
	

Here 1shotuser.tmpl and 1shotassistant.tmpl show an example user query and response. This is one way to do few-shot prompting; refer to the semgrep2fix example.
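
As a rough illustration of what the few-shot config above implies, the sketch below lists the chat messages in the order they would presumably be sent: the system template first, then each [[template_prompt]] entry in file order. The promptTemplate and message types and the buildMessages function are hypothetical names invented for this sketch; they are not ThoughtLoom's internals, and no template is actually rendered here.

```go
package main

import "fmt"

// promptTemplate mirrors one [[template_prompt]] entry from the config.
// The type and its field names are hypothetical, chosen for this sketch.
type promptTemplate struct {
	Role     string
	Template string
}

// message is a simplified chat message: a role plus the template file whose
// rendered text would become the message content.
type message struct {
	Role   string
	Source string
}

// buildMessages puts the system template first and then appends every
// prompt entry in the order it was listed (an assumed ordering).
func buildMessages(systemTemplate string, prompts []promptTemplate) []message {
	msgs := []message{{Role: "system", Source: systemTemplate}}
	for _, p := range prompts {
		msgs = append(msgs, message{Role: p.Role, Source: p.Template})
	}
	return msgs
}

func main() {
	msgs := buildMessages("./system.tmpl", []promptTemplate{
		{Role: "user", Template: "./1shotuser.tmpl"},
		{Role: "assistant", Template: "./1shotassistant.tmpl"},
		{Role: "user", Template: "./user.tmpl"},
	})
	for i, m := range msgs {
		fmt.Printf("%d %-9s %s\n", i, m.Role, m.Source)
	}
}
```

The example user/assistant pair comes before the real user prompt, so the model sees one worked exchange before the actual job.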

I like to play around with prompts and parameters using the OpenAI Playground and then throw them into templates and refine.

Template files are simple Go text/template files that get passed 'inflated' JSON objects. The best way to see what that looks like is to examine our 'Creating a Job' example.

The system prompt gives the model hard instructions that are less likely to be overridden by user instructions.

A user prompt contains the information for the job, or the specific subtask of the job to be done.

Earlier OpenAI documentation indicated that the 'system' role was not always respected. This behavior seems to have improved.

Quick text/template Primer

To reference a root element named item in a passed-in JSON object, write {{.item}}

To reference elements of an array in a passed-in JSON object such as {"items":[{"idx":"1"},{"idx":"2"}]}, use {{range .items}}{{.idx}}{{end}}

For a plain array of strings such as {"items":["1","2"]}, use {{range .items}}{{.}}{{end}}

For more, check out the Go text/template reference.
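
The three patterns above can be tried outside ThoughtLoom with a few lines of Go. This is a minimal sketch using only the standard library; the render helper is a name made up for this example, and decoding into a generic map is an assumption about how the 'inflated' JSON object reaches the template.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// render decodes jsonText into a generic map and executes tmplText against
// it. The helper is illustrative only; it is not part of ThoughtLoom.
func render(tmplText, jsonText string) (string, error) {
	var data map[string]interface{}
	if err := json.Unmarshal([]byte(jsonText), &data); err != nil {
		return "", err
	}
	tmpl, err := template.New("demo").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	cases := []struct{ tmpl, input string }{
		{`{{.item}}`, `{"item":"hello"}`},
		{`{{range .items}}{{.idx}}{{end}}`, `{"items":[{"idx":"1"},{"idx":"2"}]}`},
		{`{{range .items}}{{.}}{{end}}`, `{"items":["1","2"]}`},
	}
	for _, c := range cases {
		out, err := render(c.tmpl, c.input)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out) // prints hello, then 12, then 12
	}
}
```

Note that range emits the elements back to back; any separators or line breaks have to be written inside the range body.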

Creating a Job

A ThoughtLoom job is built from two templates: the system template and the user template. The system template holds the instructions or question for the LLM, while the user template lays out the data being fed into the LLM. Both are Go text/template files; a TOML configuration file points to them, and ThoughtLoom's command-line tool uses them together to process JSON data.

A novel example using weather data:

  1. Prepare weather data as input (saved here as input_weather.json):
{
  "data": [
    {
      "temperature": "65",
      "humidity": "45",
      "conditions": "sunny",
      "timestamp": "2023-04-15T09:00:00.000Z"
    },
    {
      "temperature": "72",
      "humidity": "40",
      "conditions": "partly cloudy",
      "timestamp": "2023-04-15T12:00:00.000Z"
    },
    {
      "temperature": "76",
      "humidity": "35",
      "conditions": "sunny",
      "timestamp": "2023-04-15T15:00:00.000Z"
    }
  ]
}
  2. Create weather.toml
template_system = "./weather_system.tmpl"
template_user = "./weather_user.tmpl"
max_tokens = 2048
model = "gpt-3.5-turbo"
  3. Create weather_system.tmpl
Summarize the weather for the day, highlighting the overall conditions and any significant changes.
  4. Create weather_user.tmpl
Weather data:
{{range .data}}
- Temperature: {{.temperature}}°F, Humidity: {{.humidity}}%, Conditions: {{.conditions}}, Timestamp: ({{.timestamp}})
{{end}}
  5. Run the example
cat 'input_weather.json' | thoughtloom -c './weather.toml' > 'result_weather.json'
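
To preview the user prompt that the weather template yields before spending any tokens, the template can be rendered locally. The sketch below is an approximation under stated assumptions: it inlines weather_user.tmpl and the first two readings instead of reading files, the renderUserPrompt name is invented for this example, and ThoughtLoom's own rendering path may differ in details.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"text/template"
)

// weatherUserTmpl is the content of weather_user.tmpl, inlined for the sketch.
const weatherUserTmpl = `Weather data:
{{range .data}}
- Temperature: {{.temperature}}°F, Humidity: {{.humidity}}%, Conditions: {{.conditions}}, Timestamp: ({{.timestamp}})
{{end}}`

// renderUserPrompt executes the weather template against a JSON document.
// The function name is hypothetical and not part of ThoughtLoom.
func renderUserPrompt(jsonText string) (string, error) {
	var data map[string]interface{}
	if err := json.Unmarshal([]byte(jsonText), &data); err != nil {
		return "", err
	}
	tmpl, err := template.New("weather_user").Parse(weatherUserTmpl)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// First two readings from the example input.
	input := `{"data":[
	  {"temperature":"65","humidity":"45","conditions":"sunny","timestamp":"2023-04-15T09:00:00.000Z"},
	  {"temperature":"72","humidity":"40","conditions":"partly cloudy","timestamp":"2023-04-15T12:00:00.000Z"}
	]}`
	prompt, err := renderUserPrompt(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(prompt)
}
```

Because the line breaks around the range and end actions are part of the template text, each reading is preceded by a blank line in the rendered prompt.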

Selected Examples

Some interesting examples selected to demonstrate capability.

Fan-out Fan-in

Our whitepaper example creates a 'Table of Contents', then 'fans-out' requests across those topics to write a larger article.

Our nuclei2results example processes each nuclei finding in parallel, summarizes, then 'fans-in' to reason across the population of results.
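
The shape of a fan-out / fan-in job can be sketched in Go as follows. This is not how the bundled examples are implemented; it only illustrates the pattern. The fanOut and fanIn functions are hypothetical, the summarize function is a stand-in for one LLM call per item (no API request is made), and the parallel argument plays a role similar to the -p flag.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// fanOut applies work to every item with at most `parallel` calls in flight
// and returns the results in input order.
func fanOut(items []string, parallel int, work func(string) string) []string {
	if parallel < 1 {
		parallel = 1 // a zero-capacity semaphore would block forever
	}
	results := make([]string, len(items))
	sem := make(chan struct{}, parallel) // counting semaphore
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // waits while `parallel` workers are busy
		go func(i int, item string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = work(item) // each goroutine writes its own slot
		}(i, item)
	}
	wg.Wait()
	return results
}

// fanIn joins the per-item results into one input for a final step.
func fanIn(results []string) string {
	return strings.Join(results, "\n")
}

func main() {
	// Stand-in for one LLM call per finding.
	summarize := func(finding string) string { return "summary of " + finding }
	findings := []string{"finding-a", "finding-b", "finding-c"}
	fmt.Println(fanIn(fanOut(findings, 2, summarize)))
}
```

Writing each result to its own index keeps the output order stable regardless of which call finishes first, which matters when the combined text feeds a final summarizing prompt.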

Provided Examples

Check out the following examples in ./examples/

Crypto Prices

This example demonstrates how to use ThoughtLoom to transform a few cryptocurrency prices into a buy or sell rationale.

Extract Author Persona

This example demonstrates how to use ThoughtLoom to analyze a set of writing samples and generate a summary of the author's persona and writing style. The script processes each article to extract author persona insights and then summarizes them into a single description, helping LLMs emulate the author's distinct style, perspective, and philosophical impulses.

Transform Nuclei Scan Results into a Report with an Executive Summary

This example demonstrates how to use ThoughtLoom to transform Nuclei scan results into a report with individual findings and an executive summary.

Generate Code Fix Policies from Semgrep Rules and Examples

This example shows 1-shot prompting that draws on the LLM's training to produce a generic code fix policy, given a Semgrep rule and the code test cases that often accompany it.

Generate Code Fix Patches from Semgrep Issues

This example shows how to automatically generate policy-following patches for defects identified with semgrep using ThoughtLoom.

Generate a Whitepaper

This example demonstrates how to use ThoughtLoom to generate a full whitepaper from just a few bullet points.

License

ThoughtLoom is released under Apache 2.0.
