“The State Bar of California announced Friday that its beleaguered leader, who has faced growing pressure to resign over the botched February roll out of a new bar exam, will step down in July. Leah T. Wilson, the agency’s executive director, informed the Board of Trustees she will not seek another term in the position she has held on and off since 2017. She also apologized for her role in the February bar exam chaos.”
This is a remote education post, but not an educational identity post.
I have previously discussed online test-taking, and I guess the State Bar of California reads the Bredemarket blog, because it decided that an online bar exam would be a great idea: it would reduce the costs of renting large halls for test-taking.
“The online testing platforms repeatedly crashed before some applicants even started. Others struggled to finish and save essays, experienced screen lags and error messages and could not copy and paste text from test questions into the exam’s response field — a function officials had stated would be possible.”
No surprise, but the remote bar exam debacle was so bad that students are filing…lawsuits.
“Some students also filed a complaint Thursday in the U.S. District Court for the Northern District of California, accusing Meazure Learning, the company that administered the exam, of “failing spectacularly” and causing an “unmitigated disaster.””
Machine learning models need training data to improve their accuracy—something I know from my many years in biometrics.
And it’s difficult to get that training data—something else I know from my many years in biometrics. Consider the acronyms GDPR, CRPA, and especially BIPA. It’s very hard to get data to train biometric algorithms, so they are trained on relatively limited data sets.
At the same time that biometric algorithm training data is limited, Kevin Indig believes that generative AI large language models are ALSO going to encounter limited access to training data. Actually, they already are.
The lawsuits have already begun
A few months ago, generative AI models like ChatGPT were going to solve all of humanity’s problems and allow us to lead lives of leisure as the bots did all our work for us. Or potentially the bots would get us all fired. Or something.
But then people began to ask HOW these large language models work…and where they get their training data.
Just as some biometric models were trained on images and associated data grabbed from the web without asking permission (you know the example I’m talking about), some are alleging that LLMs are training on copyrighted content in violation of the law.
I am not a lawyer and cannot meaningfully discuss what is “fair use” and what is not, but suffice it to say that alleged victims are filing court cases.
Comedian and author Sarah Silverman, as well as authors Christopher Golden and Richard Kadrey — are suing OpenAI and Meta each in a US District Court over dual claims of copyright infringement.
The suits allege, among other things, that OpenAI’s ChatGPT and Meta’s LLaMA were trained on illegally-acquired datasets containing their works, which they say were acquired from “shadow library” websites like Bibliotik, Library Genesis, Z-Library, and others, noting the books are “available in bulk via torrent systems.”
This could be a big mess, especially since copyright laws vary from country to country. This description of the copyright implications of LLMs, for example, focuses on United Kingdom law. Laws in other countries differ.
Systems that gather data from the web, such as Google, Bing, and (relevant to us) ChatGPT, use “crawlers” to collect that information. ChatGPT, for example, has its own crawler.
Sites can block that crawler, however, and Originality AI analyzed how many of the top 1,000 websites already do. Of course, that count only includes the sites that had blocked the crawler at the time Originality AI performed its analysis.
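For the technically curious, this blocking typically happens in a site’s robots.txt file, which declares rules for each crawler’s user agent. Here is a minimal sketch, using only Python’s standard library and OpenAI’s published “GPTBot” user agent, of how you might check whether a site blocks ChatGPT’s crawler. The site names are placeholders, not sites from the Originality AI analysis.

```python
# Minimal sketch: check whether a site's robots.txt blocks OpenAI's
# GPTBot crawler. Uses only the Python standard library.
# The site names below are placeholders for illustration.
from urllib.robotparser import RobotFileParser

def blocks_gptbot(site: str) -> bool:
    """Return True if the site's robots.txt disallows GPTBot from the root."""
    parser = RobotFileParser()
    parser.set_url(f"https://{site}/robots.txt")
    parser.read()  # fetch and parse the live robots.txt file
    return not parser.can_fetch("GPTBot", f"https://{site}/")

for site in ["example.com", "example.org"]:
    status = "blocks" if blocks_gptbot(site) else "allows"
    print(f"{site} {status} GPTBot")
```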
More sites will block the LLM crawlers
Indig believes that in the future, the share of the top 1,000 sites that block ChatGPT’s crawler will rise significantly…to 84%. His belief is based on analyzing the business models of the sites that already block ChatGPT and assuming that other sites with the same business models will also find it in their interest to block ChatGPT.
The business-model categories assumed not to block ChatGPT include governments, universities, and search engines. Such sites are friendly to the sharing of information, and thus would have no reason to block ChatGPT or any other LLM crawler.
The business-model categories assumed to block ChatGPT include publishers, marketplaces, and many others. Entities with these business models are not going to turn their content over to an LLM for free.
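To make the extrapolation concrete, here is a minimal sketch of the reasoning as I understand it: classify the top sites by business model, assume every site in a “blocking” category eventually blocks the crawler, and compute the projected share. The category counts below are hypothetical placeholders that I chose so the arithmetic lands on Indig’s 84% figure; they are not his actual data.

```python
# Hypothetical illustration of the extrapolation: sites per business
# model among the top 1,000 (made-up placeholder counts), with some
# categories assumed to block the crawler and others assumed not to.
category_counts = {
    "publisher": 400,         # assumed to block
    "marketplace": 250,       # assumed to block
    "other_commercial": 190,  # assumed to block
    "government": 60,         # assumed NOT to block
    "university": 50,         # assumed NOT to block
    "search_engine": 50,      # assumed NOT to block
}
will_block = {"publisher", "marketplace", "other_commercial"}

total = sum(category_counts.values())
blocked = sum(n for cat, n in category_counts.items() if cat in will_block)
print(f"Projected share of top sites blocking: {blocked / total:.0%}")  # 84%
```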
One possibility is that LLMs will run into the same training issues as biometric algorithms.
In biometrics, the same people who loudly exclaim that biometric algorithms are racist would be horrified by the purely technical solution that would solve all inaccuracy problems: let the biometric algorithms train on ALL available biometric data. In the activists’ view (and in the view of many others), unrestricted access to biometric data for algorithmic training would be a privacy nightmare.
Similarly, those who complain that LLMs are woefully inaccurate would be horrified if the LLM accuracy problem were solved by a purely technical solution: let the algorithms train themselves on ALL available data.
Could LLMs buy training data?
Of course, there’s another solution to the problem: have the companies SELL their data to the LLMs.
In theory, this could provide the data holders with a nice revenue stream while allowing the LLMs to be extremely accurate. (Of course the users who actually contribute the data to the data holders would probably be shut out of any revenue, but them’s the breaks.)
But that’s only in theory. Based upon past experience with data holders, the people who want to use the data are probably not going to pay the data holders sufficiently.
Google and Meta to Canada: Drop dead / Mourir
Even today, Google and Meta (Facebook et al.) are greeting Canada’s Bill C-18, which mandates that platforms pay news publishers, with resistance. Here’s what Google is saying:
Bill C-18 requires two companies (including Google) to pay for simply showing links to Canadian news publications, something that everyone else does for free. The unprecedented decision to put a price on links (a so-called “link tax”) breaks the way the web and search engines work, and exposes us to uncapped financial liability simply for facilitating access to news from Canadian publications….
As a result, we have informed them that we have made the difficult decision that, when the law takes effect, we will be removing links to Canadian news publications from our Search, News, and Discover products.
Google News Showcase is the program that gives money to news organizations in Canada. Meta has a similar program. Peter Menzies notes that these programs give tens of millions of (Canadian) dollars to news organizations, but that could end, despite government threats.
The federal and Quebec governments pulled their advertising spends, but those moves amount to less money than Meta will save by ending its $18 million in existing journalism funding.
Bearing in mind that Big Tech is reluctant to give journalistic data holders money even when a government ORDERS that they do so…
…what is the likelihood that generative AI algorithm authors (including Big Tech companies like Google and Microsoft) will VOLUNTARILY pay funds to data holders for algorithm training?
If Kevin Indig is right, LLM training data will become extremely limited, adversely affecting the algorithms’ use.
What does AdvoLogix say about using AI in the workplace?
AdvoLogix’s post is clear in its intent. It is entitled “9 Ways to Use AI in the Workplace.” The introduction to the post explains AdvoLogix’s position on the use of artificial intelligence.
Rather than replacing human professionals, AI applications take a complementary role in the workplace and improve overall efficiency. Here are nine actionable ways to use artificial intelligence, no matter your industry.
I won’t list ALL nine of the ways—I want you to go read the post, after all. But let me highlight one of them—not the first one, but the eighth one.
Individual entrepreneurs can also benefit from AI-driven technologies. Entrepreneurship requires great financial and personal risk, especially when starting a new business. Entrepreneurs must often invest in essential resources and engage with potential customers to build a brand from scratch. With AI tools, entrepreneurs can greatly limit risk by improving their organization and efficiency.
The AdvoLogix post then goes on to recommend specific ways that entrepreneurs can use artificial intelligence, including:
AI shopping
Use AI Chatbots for Customer Engagement
Regardless of how you feel about the use of AI in these areas, you should at least consider them as possible options.
Why did AdvoLogix write the post?
Obviously the company had a reason for writing the post, and for sharing the post with people like me (and like you).
AdvoLogix provides law firms, legal offices, and public agencies with advanced, cloud-based legal software solutions that address their actual needs.
Thanks to AI tools like Caster, AdvoLogix can provide your office with effective automation of data entry, invoicing, and other essential but time-consuming processes. Contact AdvoLogix to request a free demo of the industry’s best AI tools for law offices like yours.
So I’m not even going to provide a Bredemarket call to action, since AdvoLogix already provided its own. Good for AdvoLogix.
But what about Steven Schwartz?
The AdvoLogix post did not specifically reference Steven Schwartz, although the company stated that you should control the process yourself and not cede control to your artificial intelligence tool.
Roberto Mata sued Avianca airlines for injuries he says he sustained from a serving cart while on the airline in 2019, claiming negligence by an employee. Steven Schwartz, an attorney with Levidow, Levidow & Oberman and licensed in New York for over three decades, handled Mata’s representation.
But at least six of the submitted cases by Schwartz as research for a brief “appear to be bogus judicial decisions with bogus quotes and bogus internal citations,” said Judge Kevin Castel of the Southern District of New York in an order….
In late April, Avianca’s lawyers from Condon & Forsyth penned a letter to Castel questioning the authenticity of the cases….
Among the purported cases: Varghese v. China South Airlines, Martinez v. Delta Airlines, Shaboon v. EgyptAir, Petersen v. Iran Air, Miller v. United Airlines, and Estate of Durden v. KLM Royal Dutch Airlines, all of which did not appear to exist to either the judge or defense, the filing said.
Schwartz, in an affidavit, said that he had never used ChatGPT as a legal research source prior to this case and, therefore, “was unaware of the possibility that its content could be false.” He accepted responsibility for not confirming the chatbot’s sources.
Schwartz is now facing a sanctions hearing on June 8.
Behind that smiling face beats the heart of an opinionated, crotchety, temperamental writer.
When you’ve been writing, writing, and writing for…um…many years, you tend to like to write things yourself, especially when you’re being paid to write.
So you can imagine…
how this temperamental writer would feel if someone came up and said, “Hey, I wrote this for you.”
how this temperamental writer would feel if someone came up and said, “Hey, I had ChatGPT write this for you.”
So how do you think that I feel about ChatGPT, Bard, and other generative AI text writing tools?
Actually, I love them.
But the secret is in knowing how to use these tools.
Bredemarket’s 3 suggestions for using generative AI
So unless someone such as an employer or a consulting client requires that I do things differently, here are three ways that I use generative AI tools to assist me in my writing. You may want to consider these yourself.
Bredemarket Suggestion 1: A human should always write the first draft
The first suggestion that I follow is that I always write the first draft. I don’t send a prompt off and let a bot write the first draft for me.
Obviously pride of authorship comes into play. But there’s something else at work also.
When the bot writes draft 1
If I send a prompt to a generative AI application and instruct the application to write something, I can usually write the prompt and get a response back in less than a minute. Even with additional iterations, I can compose the final prompt in five minutes…and the draft is done!
And people will expect five-minute responses. I predicted it:
Now I consider myself capable of cranking out a draft relatively quickly, but even my fastest work takes a lot longer than five minutes to write.
“Who cares, John? No one is demanding a five-minute turnaround.”
Not yet.
Because it was never possible before (unless you had proposal automation software, but even that couldn’t create NEW text).
What happens to us writers when a five-minute turnaround becomes the norm?
Now what happens when, instead of sending a few iterative prompts to a tool, I create the first draft the old-fashioned way? Well obviously it takes a lot longer than five minutes…even if I don’t “sleep on it.”
But the entire draft-writing process is also a lot more iterative and (sort of) collaborative. For example, take the “Bredemarket Suggestion 1” portion of the post that you’re reading right now.
It originally wasn’t “Bredemarket Suggestion 1.” It was “Bredemarket Rule 1,” but then I decided not to be so dictatorial with you, the reader. “Here’s what I do, and you MAY want to do it also.”
And I haven’t written this section, or the rest of the post, in a linear fashion. I started writing Suggestion 3 before I started the other 2 suggestions.
I’ve been jumping back and forth throughout the entire post, tweaking things here and there.
Just a few minutes ago (as I type this) I remembered that I had never fully addressed my two-week-old LinkedIn post regarding future expectations of five-minute turnarounds. I still haven’t fully addressed it, but I was able to repurpose the content here.
Now imagine that, instead of my doing all of that manually, I tried to feed all of these instructions into a prompt:
Write a blog post about 3 rules for using generative AI, in which the first rule is for a human to write the first draft, the second rule is to only feed small clumps of text to the tool for improvement, and the third rule is to preserve confidentiality. Except don’t call them rules, but instead use a nicer term. And don’t forget to work in the story about the person who wrote something in ChatGPT for me. Oh, and mention how ornery I am, but use three negative adjectives in place of ornery. Oh, and link to the Writing, Writing, Writing subsection of the Who I Am page on the Bredemarket website. And also cite the LinkedIn post I wrote about five minute responses; not sure when I wrote it, but find it!
What would happen if I fed that prompt to a generative AI tool?
You’ll find out at the end of this post.
Bredemarket Suggestion 2: Only feed little bits and pieces to the generative AI tool
The second suggestion that I follow is that after I write the first draft, I don’t dump the whole thing into a generative AI tool and request a rewrite of the entire block of text.
Instead, I dump little bits and pieces into the tool. (A short code sketch of what this can look like follows the list below.)
Such as a paragraph. There are times when I may feed an entire paragraph to a tool, just to look at some alternative ways to say what I want to say.
Or a sentence. I want my key sentences to pop. I’ll use generative AI to polish them until they shine.
The “code snippet” (?) rewrite that created the sentence above, after I made a manual edit to the result.
Or the title. You can send blog post titles or email titles to generative AI for polishing. (Not my word.) But check them; HubSpot flagged one generated email title as “spammy.”
Or a single word. Yes, I know that there are online thesauruses that can take care of this. But you can ask the tool to come up with 10 or 100 suggestions.
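Here is what feeding a single paragraph (rather than the whole draft) to a tool can look like in code. This is a minimal sketch assuming OpenAI’s Python client; the model name, the instructions, and the sample paragraph are my own placeholders, not a Bredemarket standard.

```python
# Minimal sketch: send ONE paragraph (not the whole draft) to a
# generative AI tool and ask for rewrite suggestions.
# Assumes the openai package and an OPENAI_API_KEY environment variable;
# the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

paragraph = (
    "Our new service helps small businesses write proposals "
    "that win more business."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whatever model you have access to
    messages=[
        {"role": "system",
         "content": "Suggest three alternative wordings for this paragraph."},
        {"role": "user", "content": paragraph},
    ],
)
print(response.choices[0].message.content)
```

The same pattern works for a sentence, a title, or a single word; only the instruction changes.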
Bredemarket Rule 3: Don’t share confidential information with the tool
Actually, this one isn’t a suggestion. It’s a rule.
Remember the “Hey, I had ChatGPT write this for you” example that I cited above? That actually happened to me. And I don’t know what the person fed as a prompt to ChatGPT, since I only saw the end result, a block of text that included information that was, at the time, confidential.
OK, not THAT confidential.
Did my “helper” feed that confidential information to ChatGPT, allowing it to capture that information and store it in its systems?
Let’s say that Bredemarket is developing a new writing service, the “Bredemarket 280 Tweet Writing Service.” (I’m not. It’s not economically feasible. But bear with me.)
Now this is obviously an extremely valuable trade secret.
If someone scouring generative AI data found out about this offering and beat me to the punch, I would lose $45 billion. Or maybe less.
So how should I have a generative AI tool edit text about my new service?
First, don’t use a Bredemarket account to submit the prompt. Even if I follow all the obfuscation steps that I am about to list below, the mere fact that the prompt was associated with a Bredemarket account links Bredemarket to the data.
Second, if the word “Bredemarket” appears in the prompt, change it to something else. Like my standby WidgetCo, or maybe Wildebeest Inc.
Third, obfuscate other parts of the prompt. Perhaps change 280 (a number closely associated with modern-day Twitter) to something else, and maybe change other things also. (A scripted version of these substitutions appears below.)
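If you want to be systematic about the second and third steps, the substitutions can be scripted, which also lets you reverse them after the tool responds. A minimal sketch, using the hypothetical names from this example:

```python
# Minimal sketch: swap confidential terms for decoys before prompting a
# generative AI tool, then reverse the swaps in the tool's output.
# The mapping uses the hypothetical names from this post.
SUBSTITUTIONS = {
    "Bredemarket": "Wildebeest Inc.",
    "280": "176",
    "Tweet Writing": "YouTube Video Creation",
}

def obfuscate(text: str) -> str:
    """Replace each confidential term with its decoy."""
    for real, decoy in SUBSTITUTIONS.items():
        text = text.replace(real, decoy)
    return text

def deobfuscate(text: str) -> str:
    """Restore the confidential terms in the tool's response."""
    for real, decoy in SUBSTITUTIONS.items():
        text = text.replace(decoy, real)
    return text

prompt = "Announce the Bredemarket 280 Tweet Writing Service."
safe_prompt = obfuscate(prompt)  # this is what actually goes to the tool
print(safe_prompt)
# "Announce the Wildebeest Inc. 176 YouTube Video Creation Service."
```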
To show how the obfuscated prompt plays out in practice, here’s something I just submitted to a well-known generative AI tool.
Convert the following to eye-catching text with 250 or fewer characters: “Wildebeest Inc. is pleased to announce that we are providing a brand new service for you, the Wildebeest 176 YouTube Video Creation Service. This service, priced at $50, allows me to collaborate with you in creating a YouTube video. Before I start work, I ask you a number of questions about your desired video, including where the video will be shown, the features you want to highlight, and the target audience. I prepare seventeen drafts during the process, and you can review each of them. The result is a YouTube video that meets your needs and brings your company a lack of profits.”
So my collaborative service provided…a code snippet!
Code snippet
**Wildebeest 176 YouTube Video Creation Service**
**Get a custom YouTube video for $50**
* Collaborate with a video creator
* 17 drafts for review
* Meet your needs and bring profits
**Sign up today!**
Use code with caution.
Now I can edit the text and insert the real words, adjusting them as needed, without revealing my super-secret project.
Would a temperamental writer really do all of this?
Yes, a temperamental writer would really do all of this.
Despite my (overly?) high opinion of my own written work vs. something a bot would write, in certain circumstances the bot can improve my writing.
And as long as I disclose to a potential Bredemarket client (or an employer) my three suggestions (whoops, two suggestions and one rule) for using generative AI, there should be no ethical or legal problem in using a tool. In a sense it’s like using online grammar correction tools, or a book like a dictionary or thesaurus.
Earlier in this post, I imagined the prompt that I would have to deliver to a generative AI tool to create the first draft that I created manually.
Write a blog post about 3 rules for using generative AI, in which the first rule is for a human to write the first draft, the second rule is to only feed small clumps of text to the tool for improvement, and the third rule is to preserve confidentiality. Except don’t call them rules, but instead use a nicer term. And don’t forget to work in the story about the person who wrote something in ChatGPT for me. Oh, and mention how ornery I am, but use three negative adjectives in place of ornery. Oh, and link to the Writing, Writing, Writing subsection of the Who I Am page on the Bredemarket website. And also cite the LinkedIn post I wrote about five minute responses; not sure when I wrote it, but find it!
Are you wondering what I’m wondering?
So I fed the prompt above into Google Bard. The result begins with the next paragraph. I think my version is more entertaining, but the bot version is admittedly more succinct, if less accurate (especially the “ornery” paragraph). And after a lot of thought, I canned this version of the post entirely and came up with a more succinct, mostly manually created version instead. You can read it here.
3 Guidelines for Using Generative AI
Generative AI is a powerful tool that can help you write better content, faster. However, it’s important to use it responsibly. Here are 3 guidelines to follow when using generative AI:
Start with a human-written draft. Generative AI is best used to improve existing content, not to create it from scratch. Start by writing a draft of your content, then use generative AI to polish it.
Feed small clumps of text to the tool. Generative AI can be inaccurate if you feed it too much text at once. Instead, feed it small clumps of text, and then review and edit the results.
Prioritize confidentiality. Generative AI can be used to create confidential content. If you are using generative AI to create confidential content, be sure to take steps to protect it.
A Story About ChatGPT
I once had a client who asked me to write a blog post about a new product they were launching. I was really busy at the time, so I decided to use ChatGPT to help me write the post. I gave ChatGPT a brief overview of the product, and it generated a draft of the post in about 10 minutes. The post was well-written and informative, and it saved me a lot of time.
Ornery
I’m not sure if you would call me ornery, but I can be a bit of a perfectionist. I like to make sure that my work is the best it can be, and I’m not afraid to ask for help when I need it.
Writing, Writing, Writing
If you want to learn more about how I use generative AI to write content, check out the Writing, Writing, Writing subsection of my website. I also wrote a LinkedIn post about five minute responses. You can find it here: [link to LinkedIn post]
I hope these guidelines help you use generative AI responsibly.