The lead: The content team at WSS.media aimed to test if GPT-4 could effectively scale up content production while reducing costs. Discover how our SEO editors leveraged GPT-4 to generate four types of content: blog posts, outreach articles, website copies, and rewrites. Some useful insights for SEO and content agencies inside. Read our case study and explore the potential of GPT-4 in your content strategy.
Hi there, I’m Dariya, Head of Content at WSS.media. For several months, my team and I delved into the world of GPT-4 in our cozy content department. With a burning desire to understand this tool inside and out (watch out, GPT-5, I’m coming for you), I wanted to determine if it could boost our clients’ content output without breaking the bank. Our clients enthusiastically agreed to participate in the test and were just as excited as we were to see the results.
And let me tell you, they were pretty interesting.
We measured everything we could, especially the time and money we invested. This helped us determine whether it’s profitable for SEO agencies to incorporate GPT-4 into their content department’s work.
But that’s not all.
As a Team Lead, I saw a golden opportunity to ease the burden on our in-house editors and reduce the time they spend writing texts.
I won’t bore you with a long introduction about what GPT-4 is and how amazing it is. I’m sure you already know that. Instead, let’s dive straight into our experience testing this powerful tool and its potential.
Before the Testing: Setting the Stage
This is how our SEO agency set up the GPT-4 content experiment. We’ll discuss the content types we chose and the criteria we used to measure success, making it easy for you to understand our approach. At WSS.media, we take challenges like this seriously.
Choosing Types of Texts
Initially, we identified the specific types of content we wanted to include in our GPT-4 testing process. Our goal was to diversify the study as much as possible to gather a wealth of information. That’s why we chose four of the most popular content types among our clients for the testing:
- blog posts — 20 texts;
- outreach articles — 40 texts;
- website copies — 40 texts;
- rewrites — 40 texts.
Determining Benchmarks
For any successful testing procedure, it’s essential to establish clear evaluation criteria. In this case study, we selected the following benchmarks to assess the content produced with the help of GPT-4:
- time spent on text creation (writing and editing);
- readability (the ease with which a reader can understand a written text; the Flesch Reading Ease score according to Readable);
- AI text detection (according to GPTradar);
- text originality;
- average cost of a final text (writing and editing);
- search engine indexing (in comparison with already published content);
- organic traffic (in comparison with already published content).
By focusing on these specific criteria, we aimed to uncover whether GPT-4 could truly enhance our clients’ content output without straining their budgets.
Organizing the Testing Process
Once we had settled on the types of content and evaluation criteria for our case study, it was crucial to devise a well-structured roadmap outlining the testing process.
Simply generating text wouldn’t provide the insights needed to answer our questions. Thus, we decided to split the content creation tasks evenly between our seasoned writers, who have long worked on our clients’ projects, and GPT-4.
This approach allowed us to accurately compare the results of human and AI-driven content creation while tackling the same tasks.
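If you’d like to reproduce this kind of even split, here is a minimal sketch. It assumes a random assignment, which is our illustration rather than a description of how we actually distributed the briefs; the function name and the brief IDs are hypothetical.

```python
import random

def split_evenly(briefs, seed=0):
    """Shuffle the briefs and split them into two equal-sized groups."""
    if len(briefs) % 2 != 0:
        raise ValueError("need an even number of briefs for an even split")
    shuffled = list(briefs)                 # copy so the input list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical brief IDs standing in for the 40 rewrites in the test.
rewrite_briefs = [f"rewrite-{i:02d}" for i in range(1, 41)]
human_batch, gpt4_batch = split_evenly(rewrite_briefs)
print(len(human_batch), len(gpt4_batch))  # 20 20
```

Shuffling before splitting is one way to keep easier or harder briefs from piling up on one side of the comparison.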
Documenting the Results
For tracking our GPT-4 testing results, we opted for Google Sheets. In our spreadsheet, we outlined hypotheses, addressed concerns, and described the content creation process for each text type.
If you want to replicate our experiment for your projects, feel free to use our template (to copy it, select “File” > “Make a copy”). We’re happy to share our findings with you.
Tip: Remember to assign responsibilities to team members for each step of the testing process. Setting deadlines helps avoid chaos as well. This approach ensures precise and accurate results.
All our steps helped integrate GPT-4 testing smoothly into our workflows. It didn’t become a separate task. Instead, our editors continued their usual work but with a new content creation tool.
Now it’s time to reveal how we worked with each type of content and the results we achieved.
Working with Rewrites
Rewrites often face prejudice because they are associated with low-quality content. However, this reputation is undeserved. This type of text fulfills multiple goals, like creating new and original content from existing resources. This approach keeps brands engaged and relevant in the digital realm, capturing audience interest.
Additionally, SEO experts update content to keep up with the newest algorithms and keyword trends. This helps brands rise in search results, gaining more visibility.
Speaking of our experiment, the primary goal for AI at WSS.media was automating the rewriting of existing texts. Repeating information in new words can be tiresome for human beings, so we aimed for efficiency.
20 texts (500–600 words each) were rewritten by our editors, and the same number of texts were rephrased by GPT-4. We evaluated each piece based on the previously outlined criteria. For ease of reference, I have put together a table with average indicators:
Summing up the Results
I believe GPT-4 is perfectly suited for rewrites. With its assistance, we managed to reduce expenses and work time threefold. GPT-4 is trained on vast amounts of data, enabling it to comprehend and process the meaning behind words and phrases. This understanding allows it to rephrase the content while preserving the original message and intent.
Hurdles Faced
Despite GPT-4 being ideally suited for rewriting existing texts, there are several aspects to bear in mind.
- Inconsistency in quality. While GPT-4 is generally effective at producing coherent text, it may occasionally generate content that is less relevant, repetitive, or poorly structured.
- Over-optimization. GPT-4 might create content that is overly optimized for certain keywords or phrases, which could lead to keyword stuffing or a negative user experience.
- Loss of original meaning. In some cases, GPT-4 may inadvertently alter the meaning or intent of the original content during the rewriting process.
Careful review and editing of the generated content are necessary to avoid these issues.
Crafting Outreach Articles
At WSS.media, our content team also writes outreach articles for link builders. These are short articles linked to a specific client’s website page, published on large authoritative platforms. Their main goal is to enhance the site’s backlink profile and authority, rather than attracting organic traffic.
Creating outreach articles is similar to rewriting, but we base them on multiple sources, usually 2-3 articles. For testing, we wrote 20 texts (500–600 words each) in the usual manner and 20 using GPT-4.
Summing up the Results
Using multiple references for rewriting turned out to be a more complex task for GPT-4. Nonetheless, the results were still quite optimistic. The AI managed to reduce the time needed to create a single outreach article by an hour and cut down the cost by $38.
Hurdles Faced
The challenges that the editors faced were the same as those encountered during the rewriting tests.
Creating Blog Posts
I believe that creating blog posts is one of the most complex tasks in the world of copywriting. The primary aim is to develop engaging and informativ
At first, the authors attempted to produce a complete article in one go while adhering to the existing guidelines and structure. However, the final text fell short in terms of quality. Therefore, we composed the article section by section. This approach allowed for meticulous control over the text’s quality and kept the text from being flagged by AI-content detection. And here are the results we obtained:
Summing up the Results
The data reveals that crafting articles with the aid of GPT-4 remarkably speeds up the process. Within the timeframe required to write a single piece using conventional methods (24 working hours), one can generate 3-4 articles using AI.
However, the savings per article are not remarkably high: merely $20. This is mainly due to the substantial amount of time and energy editors have to invest in polishing AI-generated texts.
Hurdles Faced
Our experience with GPT-4 in crafting blog articles was not entirely smooth. Here are some difficulties to keep in mind when creating similar AI-generated content:
- Unfamiliar references. Before writing a blog post, our editors conduct initial research to gain a deeper understanding of the topic and identify appropriate information sources. Nonetheless, ensuring the credibility and quality of these sources becomes difficult when producing AI-generated texts. Thus, be ready to thoroughly review the text after it is generated.
- Ignoring editorial policies. Many of our clients have well-defined editorial policies that need to be considered when composing articles. We supplied detailed information about these guidelines in the requests, but the outcomes did not always adhere to them.
- Challenges with niche topics. When an article focuses on a general subject, AI effortlessly gathers information, resulting in content of praiseworthy quality. Alas, GPT-4 falls short when tackling niche topics. The AI “fabricates” information, causing the material to lose its credibility.
- Constant repetition. This relates to both words and overall ideas. The AI tends to echo itself, using multiple similar words and crafting paragraphs that essentially mirror each other.
Writing Website Copies
Compelling web copy entices visitors to explore and purchase. The finished text should express your business’s purpose, create a comprehensive resource hub, make a positive first impression, and reap SEO benefits.
Our team usually writes web texts detailing brands’ values and their services, which involve unique, specialized information. We often interview clients for valuable insights to prepare for crafting these texts, making it a true challenge for GPT-4.
In our experiment, we chose 40 texts (200–400 words each), with half created by editors and the other half by AI. Here’s the average data we gathered:
Summing up the Results
GPT-4’s performance was disappointing in this case. The time it took to generate website content surpassed that of crafting similar texts by a writer. This was mainly due to the AI’s struggle to find relevant information, leading it to concoct content. Consequently, our editors had to spend considerable time correcting inaccuracies.
This challenge also contributed to the higher cost of producing website content.
Although GPT-4 may not excel in creating website texts from the ground up, it can rewrite existing content, making it simpler and more attractive.
Hurdles Faced
AI’s performance with website content wasn’t as successful as with rewrites and outreach articles, presenting several challenges:
- Overlooking company insights. AI solely depends on information from public domains, which poses a challenge for GPT-4 in crafting content. Startups and small businesses, often having limited data available on the web, find it particularly difficult to benefit from AI-generated content.
- Inconsistent tone of voice. Each company has a unique communication style with its audience. GPT-4 doesn’t always capture the necessary tone to engage website visitors effectively.
Even though there are only two hurdles, they deeply affect the process of crafting tailored texts for particular brands.
Overall Results
We never regretted testing GPT-4. We boldly integrated AI into our clients’ projects at WSS.media, instead of limiting it to internal use only. This enabled us to quickly evaluate the pros and cons of leveraging GPT-4 for our and our clients’ businesses. To summarize, here are the outcomes of the testing.
Time Spent on Text Creation
GPT-4 greatly decreased the time required for content creation, particularly for rewrites (67% faster), outreach articles (33% faster), and blog posts (75% faster). Website texts, however, took about 20% longer with AI than without it, so for this content type it is preferable to rely on traditional methods.
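If you want to run the same comparison on your own numbers, percentages like these come from comparing the average hours per text with and without AI. Below is a minimal sketch. The function name is ours for illustration, and the 6-hour figure is not a measured value: it is simply what “75% faster” implies against the 24 working hours a blog post takes us by hand.

```python
def percent_change(human, ai):
    """Relative change of the AI figure versus the human baseline, in percent.

    Negative values mean the AI-assisted process was faster (or cheaper);
    positive values mean it was slower (or more expensive).
    """
    if human <= 0:
        raise ValueError("human baseline must be positive")
    return (ai - human) / human * 100

# 24 hours: manual blog post baseline from this case study.
# 6 hours: assumed AI-assisted time, derived from the "75% faster" figure.
print(percent_change(24, 6))  # -75.0
```

The same function works for cost: pass the average price of a human-written text and of an AI-assisted one instead of hours.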
Readability
In simple terms, readability is the ease of understanding a written piece. It involves factors like sentence structure, word choice, and layout, which collectively decide if a reader can comprehend the text effortlessly or finds it difficult.
From the summary table, it’s evident that humans slightly outperform GPT-4 in text readability as of now.
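For readers curious about what sits behind a readability number, the Flesch Reading Ease score is computed as 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), and higher scores mean easier text. The sketch below is a rough approximation only: the syllable counter is a simple vowel-group heuristic of our own, so its output will not match Readable exactly.

```python
import re

def count_syllables(word):
    """Rough heuristic: count groups of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text):
    """Approximate Flesch Reading Ease score for an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

simple = "The cat sat on the mat."
dense = "Comprehensive organizational restructuring necessitates extraordinary deliberation."
print(flesch_reading_ease(simple) > flesch_reading_ease(dense))  # True
```

Short sentences and short words push the score up, which is why an editor’s trimming pass usually improves it.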
AI Text Detection
We previously detailed the process of how editors produced texts. They worked in segments to uphold quality and follow SEO guidelines. Consequently, GPTradar, the detector we used, did not recognize the texts as GPT-4 generated.
Text Originality
In our tests, the texts generated by GPT-4 consistently scored 100% for originality. It’s hard for a human to compete with AI in this aspect.
Average Cost of a Final Text
Unfortunately, GPT-4 wasn’t a cure-all. In certain instances, AI was successful in reducing costs, but in others, text generation proved to be pricier than conventional writing. However, a clear pattern emerges: the less post-generation editing required, the more cost-effective the production.
- rewrites — 67% more cost-effective;
- outreach articles — 33% more cost-effective;
- blog posts — 9% more cost-effective;
- website copies — 20% more expensive.
Search Engine Indexing
We faced no issues with text indexing on Google. This metric remained consistent when comparing texts written by humans and those generated by GPT-4.
Organic Traffic
Every editor made sure that the texts generated by GPT-4 aligned with the requirements provided by the SEO specialist. As a result, these texts achieved rankings as high as the ones we produce manually.
Final Words
GPT-4 is an essential tool for copywriters and editors, but it’s important to know how to use it effectively. It’s not suitable for every type of text or task: copy about small companies and niche-specific articles are two examples where it struggles. Moreover, it’s crucial to formulate requests correctly and thoroughly review the generated results.
Answering the question in the title, yes, GPT-4 can increase content creation volume without a significant increase in expenses. But only in certain cases.
For instance, we immediately implemented AI for rewriting and creating outreach articles at WSS.media. This saves our clients money and allows our editors to focus on more interesting and complex tasks.
However, we are not yet planning to use AI for writing website texts and narrow-topic articles. The testing showed that GPT-4 generates raw texts. They need to be refined, proofread, and edited by a human editor.
At WSS.media, we love experimenting and testing because it’s the only way to provide clients with a modern, efficient service that will help them achieve their goals and save money. If you’d like to work with us, feel free to reach out. We’re always available.