Well, the FATE side of the house has released its first two studies, including one entitled “Face Analysis Technology Evaluation (FATE) Part 10: Performance of Passive, Software-Based Presentation Attack Detection (PAD) Algorithms” (NIST Internal Report NIST IR 8491; PDF here).
Machine learning models need training data to improve their accuracy—something I know from my many years in biometrics.
And it’s difficult to get that training data, which is something else I know from my many years in biometrics. Consider the acronyms GDPR, CPRA, and especially BIPA. It’s very hard to get data to train biometric algorithms, so they are trained on relatively limited data sets.
At the same time that biometric algorithm training data is limited, Kevin Indig believes that generative AI large language models are ALSO going to encounter limited accessibility to training data. Actually, they are already.
The lawsuits have already begun
A few months ago, generative AI models like ChatGPT were going to solve all of humanity’s problems and allow us to lead lives of leisure as the bots did all our work for us. Or potentially the bots would get us all fired. Or something.
But then people began to ask HOW these large language models work…and where they get their training data.
Just like biometric training models that grab images and associated data from the web without asking permission (you know the example that I’m talking about), some are alleging that LLM developers are training their models on copyrighted content in violation of the law.
I am not a lawyer and cannot meaningfully discuss what is “fair use” and what is not, but suffice it to say that alleged victims are filing court cases.
Comedian and author Sarah Silverman, as well as authors Christopher Golden and Richard Kadrey — are suing OpenAI and Meta each in a US District Court over dual claims of copyright infringement.
The suits allege, among other things, that OpenAI’s ChatGPT and Meta’s LLaMA were trained on illegally-acquired datasets containing their works, which they say were acquired from “shadow library” websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are “available in bulk via torrent systems.”
This could be a big mess, especially since copyright laws vary from country to country. This description of copyright law LLM implications, for example, is focused upon United Kingdom law. Laws in other countries differ.
Systems that get data from the web, such as Google, Bing, and (relevant to us) ChatGPT, use “crawlers” to gather the information from the web for their use. ChatGPT, for example, has its own crawler.
But that only includes the sites that blocked the crawler when Originality AI performed its analysis.
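How does a site block a crawler in the first place? Usually with a robots.txt file, which names a crawler's user agent and tells it what it may not fetch. Here is a minimal sketch of checking such a file with Python's standard library. The robots.txt content below is a hypothetical sample that I made up for illustration, and the helper name is_blocked is my own. I'm assuming the "GPTBot" user-agent token, which is the name OpenAI documents for its crawler; this post's sources don't spell it out. This is not a description of how Originality AI performed its analysis.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; not taken from any real site.
# "GPTBot" is assumed here as the user-agent token for OpenAI's crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str,
               url: str = "https://example.com/") -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Googlebot"):
        status = "blocked" if is_blocked(SAMPLE_ROBOTS_TXT, agent) else "allowed"
        print(f"{agent}: {status}")
```

Keep in mind that robots.txt is a request, not an enforcement mechanism. It only works when the crawler chooses to honor it.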
More sites will block the LLM crawlers
Indig believes that in the future, the share of the top 1000 sites that block ChatGPT’s crawler will rise significantly…to 84%. His belief is based on analyzing the business models of the sites that already block ChatGPT and assuming that other sites with the same business models will also find it in their interest to block ChatGPT.
The business models that won’t block ChatGPT are assumed to include governments, universities, and search engines. Such sites are friendly to the sharing of information, and thus would have no reason to block ChatGPT or any other LLM crawler.
The business models that would block ChatGPT are assumed to include publishers, marketplaces, and many others. Entities using these business models are not going to just turn their content over to an LLM for free.
One possibility is that LLMs will run into the same training issues as biometric algorithms.
In biometrics, the same people that loudly exclaim that biometric algorithms are racist would be horrified at the purely technical solution that would solve all inaccuracy problems—let the biometric algorithms train on ALL available biometric data. In the activists’ view (and in the view of many), unrestricted access to biometric data for algorithmic training would be a privacy nightmare.
Similarly, those who complain that LLMs are woefully inaccurate would be horrified if the LLM accuracy problem were solved by a purely technical solution: let the algorithms train themselves on ALL available data.
Could LLMs buy training data?
Of course, there’s another solution to the problem: have the companies SELL their data to the LLMs.
In theory, this could provide the data holders with a nice revenue stream while allowing the LLMs to be extremely accurate. (Of course the users who actually contribute the data to the data holders would probably be shut out of any revenue, but them’s the breaks.)
But that’s only in theory. Based upon past experience with data holders, the people who want to use the data are probably not going to pay the data holders sufficiently.
Google and Meta to Canada: Drop dead / Mourir
Even today, Google and Meta (Facebook et al) are greeting Canada’s government-mandated Bill C-18 with resistance. Here’s what Google is saying:
Bill C-18 requires two companies (including Google) to pay for simply showing links to Canadian news publications, something that everyone else does for free. The unprecedented decision to put a price on links (a so-called “link tax”) breaks the way the web and search engines work, and exposes us to uncapped financial liability simply for facilitating access to news from Canadian publications….
As a result, we have informed them that we have made the difficult decision that, when the law takes effect, we will be removing links to Canadian news publications from our Search, News, and Discover products.
Google News Showcase is the program that gives money to news organizations in Canada. Meta has a similar program. Peter Menzies notes that these programs give tens of millions of (Canadian) dollars to news organizations, but that could end, despite government threats.
The federal and Quebec governments pulled their advertising spends, but those moves amount to less money than Meta will save by ending its $18 million in existing journalism funding.
Bearing in mind that Big Tech is reluctant to give journalistic data holders money even when a government ORDERS that they do so…
…what is the likelihood that generative AI algorithm authors (including Big Tech companies like Google and Microsoft) will VOLUNTARILY pay funds to data holders for algorithm training?
If Kevin Indig is right, LLM training data will become extremely limited, adversely affecting the algorithms’ use.
All too often, Bredemarket confines its writing discussions to the traditional ABCW (articles, blog posts, case studies, white papers) categories.
But what if your content needs are non-traditional and fall outside of the usual nice neat business writing categories?
From the 2023 Route 66 Cruisin’ Reunion, Saturday, September 16, 2023.
If you are an Inland Empire business that needs words, but not in the traditional “ABCW” (articles, blog posts, case studies, white papers) business types, Bredemarket will help you with your non-traditional writing needs.
Take a look at the examples I’ve provided below, and if these spark interest within you, authorize Bredemarket, Ontario California’s content marketing expert, to help your firm produce words that return results.
Book a meeting with me at calendly.com/bredemarket. Be sure to fill out the information form so I can best help you. For example, if you’re an Inland Empire business requiring non-traditional content, fill out the form accordingly.
I won’t go into all 22 types again, especially since some of them are internal content rather than customer-facing content. But I’d like to highlight the “ABCW” four types that I mentioned at the beginning of this blog post, plus a couple of others.
Articles and blog posts
I’m lumping articles and blog posts together, because while some “experts” try to draw hard-and-fast distinctions between the two, they’re pretty much the same thing.
Whether it’s a blog post on your website, a post or article on LinkedIn, or even some extended text associated with an Instagram picture or a TikTok video, what you’re creating is some text that entertains, persuades, inspires, or educates your reader, or perhaps all four. You set the goal for the article or blog post, then tailor the content to meet the goal. (I’ll talk more about goals later.)
Case studies
From “How Bredemarket Can Help You Win Business,” available via this post.
Case studies show your readers how your solution was applied to someone else’s problem, and how your solution can benefit your prospects with similar problems.
Maybe your prospect is a city police agency that needs a tool to solve crimes, and your case study describes how your solution solved crimes in a similar city. Again, you set the goal for the case study, then tailor the content to meet the goal.
White papers
On the surface, white papers are informational, but when a company issues a white paper, the “information” that the white paper provides should gently guide the reader toward doing business with the company that issued the paper. Using the example above, you could write a white paper that outlines “Five Critical Elements for a Local Crime-Solving Solution.” By remarkable coincidence, your own solution happens to include all five of those critical elements. Again, you set the goal and tailor the content.
Briefs, data sheets, and literature sheets
One-page sheet for the Bredemarket 400 Short Writing Service. More information here.
Perhaps you need to provide handouts to your prospects that describe your product or service.
Regardless of whether you call these handouts briefs, data sheets, literature sheets, or something else, they should at a minimum contain both “educate” and “persuade” elements: educate your prospects on the benefits of your product or service, and persuade your prospects to move closer to a sale (conversion).
Again, you set the goal and tailor the content.
Web page content
If your business has a web page, I hope that it has more words than “Under construction.” Whether you have imagery, video, audio, text, or all four on your web page, it needs to answer the questions that your prospects and customers have.
You know what I’m going to say here, but it’s still important. You set the goal and tailor the content.
But…what if your business needs content that doesn’t fall into these traditional business categories?
Non-traditional content: going to a car show
I went to a car show this weekend—specifically, this year’s Route 66 Cruisin’ Reunion in downtown Ontario, California. (Yes, I know that Route 66 actually passed three miles north of downtown Ontario, but work with me here.)
While some of the exhibitors were private individuals, others were businesses. As businesses, what was the major marketing collateral that they generated?
Not a blog post, or LinkedIn article, or any of the traditional business media collateral.
In addition to the car itself, this exhibitor included poster boards with words describing the car.
Another exhibitor did the same thing.
So while these car show exhibitors didn’t choose a traditional way to convey their words, they shared written text anyway.
Your non-traditional business communication needs
Maybe you don’t have a classic car. Maybe you don’t have a car at all. Do you need to share words with your prospects and customers anyway?
Now I don’t know your business communication needs. You do. But I can guess a few things.
Do you need to tell your clients/potential clients why you do what you do?
Do you need to tell them how you do it?
And last but not least, do you need to tell them what you do?
I know that this may seem like an unusual order to you. Why not start with what you do?
Because your customers don’t care about what you do. Your customers care about themselves.
If you keep the focus on your customers, the answer to the “why” question will induce your customers to care about you, because it shows how you can solve their problems.
Let’s illustrate this.
Why and how Bredemarket creates non-traditional content
You may be asking why I create content in the first place. There are countless content creators, both human and non-human. Why turn to me when OpenAI and its bot buddies are a lot cheaper and faster?
Normally I include my recent professional picture, but I have been writing since my college days (on a typewriter back then).
Bredemarket’s service is independent of content type. I don’t have a “Bredemarket blog writing service” or “Bredemarket data sheet writing service” or “Bredemarket case study writing service.” My services are based on word length, not content type, with my most popular service targeted to customers who need between 400 and 600 words of text. From this perspective, I don’t care if you want the words to appear on your website or your social media channel or a paper flyer or a sign next to your car or a really really long banner towed behind an airplane. (Read about the Bredemarket 400 Short Writing Service here.)
Before I write a thing, I ask you some questions. It won’t surprise you to learn that my first questions to you are why, how, and what. I then move on to questions about your goal for the content, the benefits of your solution, the target audience for your solution, and many additional questions. (Read about the Six Questions Your Content Creator Should Ask You here.)
Once the questions are out of the way, content creation is collaborative and iterative. I create a draft, you review it, and we repeat. The Bredemarket 400 service includes two review cycles; longer content needs include three review cycles. The goal is to ensure that both of us are happy with the final product.
Bredemarket’s process applies regardless of the specific content type, so I should be able to support whatever content you need, whether it’s traditional or non-traditional.
From the 2022 Cruisin’ Reunion in Ontario, California. The 2023 edition takes place this weekend.
(Updated blog post count 10/23/2023)
There are many ways for Inland Empire firms to raise awareness about their offerings. For certain firms, blogging provides quantifiable benefits. Can your firm take advantage of blogging’s fresh immediacy?
In most cases, I can provide your blog post via my standard package, the Bredemarket 400 Short Writing Service. I offer other packages and options if you have special needs.
Get in touch
Authorize Bredemarket, Ontario California’s content marketing expert, to help your firm produce words that return results.
Identity and biometrics firms can achieve quantifiable benefits with prospects by blogging. Over 40 identity and biometrics firms are already blogging. Is yours?
These firms (and probably many more) already recognize the value of identity blog post writing, and some of them are blogging frequently to get valuable content to their prospects and customers.
Is your firm on the list? If so, how frequently do you update your blog?
Get in touch with Bredemarket
To discuss your identity/biometrics blog post needs further, book a meeting with me at calendly.com/bredemarket. On the questionnaire, select the Identity/biometrics industry and Blog post content.
Always take advantage of your competitors’ weaknesses.
This post describes an easy way to take advantage of your competitors. If they’re not blogging, make sure your firm is blogging. And the post provides hard numbers that demonstrate why your firm should be blogging.
Which means that half of those companies don’t have a public corporate blog.
The same infographic also revealed the following:
86% of B2B companies are blogging. (Or, 14% are not.)
68% of social media marketers use blogs in their social media strategy. (Or, 32% don’t.)
45% of marketers say blogging is the #1 most important piece of their content strategy.
Small businesses under 10 employees allocate 42% of their marketing budget to content marketing.
So obviously some firms believe blogging is important, while others don’t.
What difference does this make for your firm?
What results do blogging companies receive?
In my view, the figures above are way too low. 100% of Fortune 500 companies and 100% of B2B companies should be blogging, and 100% of social media marketers should incorporate blogging.
Getting leads from blogging is nice, but show me the money! What about conversions?
Marketers who have prioritized blogging are 13x more likely to enjoy positive ROI.
92% of companies who blog multiple times per day have acquired a customer from their blog.
Take a look at those last two bullets related to conversion again. Blogging is correlated with positive ROI (I won’t claim causation, but anecdotally I believe it), and blogging helps firms acquire customers. So if your firm wants to make money, get blogging.
What should YOUR company do?
With numbers like these, shouldn’t all companies be blogging?
But don’t share these facts with your competitors. Keep them to yourself so that you gain a competitive advantage over them.
Now you just need to write those blog posts.
How can I help?
And if you need help with the actual writing, I, John E Bredehoft of Bredemarket, can help.
And if you’re not in the identity/biometric industry, my general content marketing expertise also applies to technology firms and general business firms.
But are computerized systems any better, and can they detect spoofed voices?
Well, in the same way that fingerprint readers worked to overcome gummy bears, voice readers are working to overcome deepfake voices.
This is only the beginning of the war against voice spoofing. Other companies will pioneer new advances that will tell the real voices from the fake ones.
As for independent testing:
ID R&D has participated in multiple ASVspoof tests, and performed well in them.
I’ve performed product marketing since 2015 (arguably earlier), and I performed that other similar-sounding role, product management, from 2000 to 2009. The two roles certainly have similarities such as customer focus, but they may be different.
Or may not. There’s no standard job description for a product marketer, and product marketing needs vary between companies.
Ignoring your prospects is NOT a winning business strategy. But a lot of companies do it anyway by not communicating regularly with their prospects.
If you ignore your prospects, your prospects will ignore you.
Meetings and money, via a third party
Of my three Bredemarket meetings (so far) today, the second was the most promising.
A person at a large company needs consulting services from me. All we need to do is work out the mechanics. The large company relies on a third party to manage its independent contractor relationships, including onboarding, time cards, and payments for hourly work. I wanted to learn about the third party, but I ran into walls when seeking current information about the firm.
The third party’s website is static
The third party’s website talks about its services, some unique aspects about the business, the story of its founder (a fascinating story), its technology partners, and its call to action. It provides ALMOST everything…with the exception of CURRENT information.
Luckily for me, I knew where to find current information on the company. Since the company is a B2B provider, I assumed that the company has a LinkedIn page. And I was right. But…
The third party’s LinkedIn page is also static
As you probably know, company LinkedIn pages have several subpages. The “About” subpage talks about the third party company’s services, and the “People” subpage links to the profiles of the company’s employees, including the founder. So I went to the “Posts” subpage for the third party…
Use other social media outlets: TikTok, X, YouTube, whatever.
Pay attention to your prospects by providing current content.
If you ignore your prospects, your prospects will ignore you.
Are you ready to stop ignoring your prospects?
If you need help creating content for your blog, your social media platforms, or your website, Bredemarket can help you regain credibility with your prospects and customers.
Authorize Bredemarket, Ontario California’s content marketing expert, to help your firm produce words that return results.