WordPress Author Style Imitator

A tool that analyzes an author's writing style from their WordPress.com blog posts and helps generate new content that mimics their unique voice.

Pipeline diagram showing the flow from WordPress.com API through style analysis to final content generation

How It Works

The tool follows a four-stage pipeline as shown in the diagram above:

Data Collection: Fetches posts and metadata from WordPress.com API
Content Classification: Uses local LLM to filter out confidential content
Style Analysis: Leverages Claude to generate comprehensive style instructions
Style Application: Applies captured style patterns to new content

Project Structure

src/: Contains the source code for the project
- auth/: Authentication related code.
- notebooks/: Jupyter notebooks for the main pipeline
- utils/: Utility functions and helpers
data/: Contains the processed data (not included in repo)
- author_posts/: Raw posts by author
- classified_posts/: Posts after confidentiality classification
- author_instructions/: Generated writing style instructions
- post_instructions/: Style-applied drafts

Setup

Clone this repository
Create a .env file in the root directory with:
- WPCOM_CLIENT_SECRET: WordPress.com API client secret
- WPCOM_ACCESS_TOKEN: WordPress.com API access token
- ANTHROPIC_API_KEY: Anthropic API key for Claude
Install required packages using Poetry: poetry install

Notebooks

Obtain WordPress.com API credentials and set up an app (see WordPress.com API documentation)
Get an Anthropic API key for Claude 3.5
Run auth_get_token.py to obtain an access token
Execute notebooks in the following order:

retrieve_posts.ipynb
llm_classify_posts.ipynb
llm_generate_author_prompts.ipynb
llm_apply_author_style.ipynb

retrieve_posts.ipynb

Fetches blog posts and engagement metrics from WordPress.com API. The notebook:

Retrieves posts using WordPress.com REST API
Collects metadata (views, likes, comments)
Handles rate limiting and pagination
Saves posts organized by domain and author
Processes cross-posts between different WordPress.com sites

llm_classify_posts.ipynb

Uses a local LLM to identify and filter out posts containing confidential information. The notebook:

Processes the top 50 posts by engagement score
Uses Ollama with Qwen 2.5 model for classification
Identifies sensitive/confidential content
Saves classification results with reasoning
Filters out confidential posts from further processing

llm_generate_author_prompts.ipynb

Analyzes non-confidential posts to generate comprehensive writing style instructions. The notebook:

Takes filtered posts from previous step
Uses Claude 3.5 to analyze writing patterns
Generates detailed style instructions covering:
- Tone and voice
- Content structure
- Language patterns
- Technical depth
- Engagement techniques
Saves instructions for use in content generation

llm_apply_author_style.ipynb

Applies the generated style instructions to new content. The notebook:

Takes a draft post as input
Uses style instructions and example posts
Leverages Claude 3.5 to rewrite content
Maintains technical accuracy while matching author's voice
Saves the style-applied output

Use Cases

Content Creation and Editing

Writing assistants that adapt to your team's voice
Draft refinement while maintaining consistent style
Technical documentation with consistent tone across teams
Blog post generation matching established author voices

Marketing and Communications

Content generation for different brand personas
Consistent messaging across multiple channels
Newsletter and marketing email writing
Campaign content that maintains brand voice

Learning and Development

Style analysis for writing improvement
Understanding different writing approaches
Learning from experienced writers' patterns
Continuous feedback on writing habits

Localization and Accessibility

Style-aware content adaptation
Maintaining voice consistency in translations
Adapting technical content for different audiences
Accessibility-focused content rewrites

Benefits and Insights

Style Preservation: Captures and maintains unique writing voices while generating new content
Learning Tool: Helps understand and learn from different writing styles
Efficiency: Streamlines content creation while maintaining consistency
Flexibility: Adapts to different writing styles and content types
Quality Control: Ensures consistent voice across multiple pieces of content

Note on Data Privacy

This repository intentionally excludes input and output data as the development used internal blog posts from Automattic. When using this tool, ensure you have appropriate permissions for the blog posts you analyze.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
src		src
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
mimic_author_style.png		mimic_author_style.png
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

WordPress Author Style Imitator

How It Works

Project Structure

Setup

Notebooks

retrieve_posts.ipynb

llm_classify_posts.ipynb

llm_generate_author_prompts.ipynb

llm_apply_author_style.ipynb

Use Cases

Content Creation and Editing

Marketing and Communications

Learning and Development

Localization and Accessibility

Benefits and Insights

Note on Data Privacy

License

About

Releases

Packages

Languages

License

gelbal/wordpress-author-style-imitate

Folders and files

Latest commit

History

Repository files navigation

WordPress Author Style Imitator

How It Works

Project Structure

Setup

Notebooks

retrieve_posts.ipynb

llm_classify_posts.ipynb

llm_generate_author_prompts.ipynb

llm_apply_author_style.ipynb

Use Cases

Content Creation and Editing

Marketing and Communications

Learning and Development

Localization and Accessibility

Benefits and Insights

Note on Data Privacy

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages