Submitted by aicharades t3_10vnkj8 in MachineLearning

update: www.wrotescan.com is now updated with the ChatGPT API!

hi all! I created a simple free tool where you can summarize and query documents of any size and estimate the cost to do so: https://www.wrotescan.com

You can edit the prompts as well as automatically chunk and combine documents. There's also a cost estimator for any pdf you upload.
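For anyone curious how a cost estimate like this could be computed, here is a rough sketch. It is not the site's actual code. It assumes about 4 characters per token (a common rule of thumb for English text), and the chunk size, prompt size, output caps, and price per 1,000 tokens are all placeholder parameters that you should replace with your own values.

```python
import math

def estimate_map_reduce_cost(
    doc_chars: int,
    chunk_chars: int = 8000,          # assumed chunk size in characters
    prompt_chars: int = 600,          # assumed size of each prompt template
    map_output_chars: int = 650,      # assumed cap on each chunk summary
    reduce_output_chars: int = 4000,  # assumed cap on the final summary
    chars_per_token: float = 4.0,     # rule-of-thumb ratio, not exact
    usd_per_1k_tokens: float = 0.02,  # placeholder price; check current pricing
) -> dict:
    """Estimate tokens and cost for one map call per chunk plus one reduce call."""
    n_chunks = max(1, math.ceil(doc_chars / chunk_chars))
    # Map step: every chunk is sent once with the map prompt and returns a summary.
    map_chars = doc_chars + n_chunks * (prompt_chars + map_output_chars)
    # Reduce step: all chunk summaries are sent together with the reduce prompt.
    reduce_chars = prompt_chars + n_chunks * map_output_chars + reduce_output_chars
    total_tokens = math.ceil((map_chars + reduce_chars) / chars_per_token)
    return {
        "chunks": n_chunks,
        "llm_calls": n_chunks + 1,
        "tokens": total_tokens,
        "usd": round(total_tokens / 1000 * usd_per_1k_tokens, 4),
    }

estimate = estimate_map_reduce_cost(doc_chars=400_000)
print(estimate)
```

A real estimator would count tokens with the model's tokenizer instead of dividing characters by 4, so treat the result as a ballpark figure only.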

Let me know if you want me to run some examples for you! Send me a pdf and tell me what you'd like summarized or extracted.

Tips

Be sure to keep {text} in both prompts, or the program will not insert your document's text into the map reduce summarizer. {text} marks where the text of each chunk is inserted into the prompt, and it can only appear once in each prompt.
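A minimal sketch of the {text} rule, in case it helps to see it as code. The function name fill_prompt is made up for illustration and this is not the site's implementation; it just checks that the placeholder appears exactly once before substituting a chunk.

```python
def fill_prompt(template: str, chunk: str) -> str:
    """Insert one chunk into a prompt template that contains {text} exactly once."""
    count = template.count("{text}")
    if count != 1:
        raise ValueError(f"prompt must contain {{text}} exactly once, found {count}")
    # str.replace is used instead of str.format so that any other braces in
    # the prompt (or in the chunk itself) are left untouched.
    return template.replace("{text}", chunk)

map_prompt = "Summarize the following section.\n\nINPUT: {text}\n\nOUTPUT:"
print(fill_prompt(map_prompt, "Chapter 1 ..."))
```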

Create a temporary OpenAI key / org to use with this site so you do not have to provide credit card information, then be sure to delete the temp key when you are done.

Learnings

Some interesting learnings I had while creating the tool:

- Minimizing the number of steps through the AI improved summarization, so map reduce was often better than a more advanced refine workflow which passes the output through the model many more times.

- LangChain is great for managing multi-step language model calls and bypassing the current limitations of ChatGPT.
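To make the first learning concrete, here is a small sketch that counts how many model calls each chunk's content passes through before it reaches the final output. It is a simplified model of the two workflows, not LangChain code: in map reduce every chunk is summarized once and then combined once, while in refine the running summary is rewritten once for every later chunk.

```python
def passes_map_reduce(n_chunks: int) -> list[int]:
    """Each chunk goes through one map call, then one shared reduce call."""
    return [2] * n_chunks

def passes_refine(n_chunks: int) -> list[int]:
    """Chunk i enters the running summary at step i and is rewritten at every later step."""
    return [n_chunks - i for i in range(n_chunks)]

n = 5
print("map reduce:", passes_map_reduce(n))  # [2, 2, 2, 2, 2]
print("refine:    ", passes_refine(n))      # [5, 4, 3, 2, 1]
```

Under this model, content from early chunks in a refine workflow is rewritten many more times than anything in map reduce, which is one plausible reason details get lost.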

55

Comments


aicharades OP t1_j7ig8eb wrote

I was really interested in how this could summarize primary sources, so I ran it on the database of Hillary Clinton's emails.

A Storyteller Summarizes Hillary Clinton’s Emails

Input: Raw PDF of 2016 Democratic National Committee email leak

------------------------

Prompt: I want you to act as a storyteller. You will come up with entertaining stories that are engaging, imaginative and captivating for the audience based on the summary text provided. It can be fairy tales, educational stories or any other type of stories which has the potential to capture people's attention and imagination. You have been given a section of a large set of emails to summarize and come up with an entertaining story.

------------------------

Output: Once upon a time, there was a group of diplomats working hard to make the world a better place. They had been discussing how to gain leverage over the Pakistanis to move them to more aggressively pursue Al-Qaeda. After much deliberation, they decided to use a combination of incentives and coercive measures to try to get the Pakistanis to cooperate. They offered incentives such as increased control over how Kerry-Lugar money was used and increased support for counterinsurgency capabilities. They also used coercive measures such as targeting military leaders who gave support to Al-Qaeda and its allies, and attempting to reduce Pakistani anxieties about the Indian front. In the end, their efforts paid off and the Pakistanis began to cooperate more with their mission. The diplomats had succeeded in making the world a better place.

The story follows Hillary as she juggles her busy schedule, responding to requests from her colleagues and friends. She is always willing to help, and her dedication to her work is evident. Despite her busy schedule, she is still able to find time to help those around her. A powerful leader was tasked with finding out whether a senator supported the surge in Iraq and obtaining a copy of his recent speech about troops to Afghanistan. The leader reached out to her team for help, and they worked together to draft a memo outlining four strategic interests, three of which overlapped with the leader's goals. With the help of her team, the leader was able to obtain the information she needed and use it to make a positive impact.

President Obama was faced with the challenge of how to win the war in Afghanistan without causing too much destruction and civilian casualties. He decided to increase the size of the Afghan army and police force, and to pay them better wages so they could live comfortably. He also decided to send more US Special Forces and tens of thousands of troops to the region. Despite these efforts, the Taliban continued to gain strength and support from the Pashtun community. Obama realized that he needed to find a better solution. He decided to focus on carefully thought out measures such as providing better pay for the Afghan army and police, and using diplomacy to negotiate with the Taliban. In the end, Obama was able to find a peaceful solution that allowed the US to win the war without causing too much destruction.

A Prime Minister of a country had a fear of losing his sight due to a childhood rugby injury. He had his eyes checked and it was found that he had two minor tears in his retina. He visited the world-renowned Moorfields Hospital for further examinations and the doctors decided against any further operations. The Prime Minister was grateful to the doctors and staff of the NHS, especially Moorfields Hospital. Meanwhile, the US Secretary of State was in contact with the Foreign Minister of Honduras to discuss the situation in the country. She asked her staff to call the US Congressman Payne to update him on the situation. Through her diplomatic efforts, she was able to make a positive impact on the lives of many.

A small island nation called Haiti faced many challenges, but the people of Haiti were determined to build a stable, prosperous and democratic future. The United Nations appointed President Clinton as a special envoy to Haiti, and he appointed Dr. Paul Farmer as the Deputy U.N. Special Envoy. Dr. Farmer had dedicated his life to helping the people of Haiti for the past 20 years. Despite the efforts of the U.S., U.N. and other countries, Haiti still faced many challenges. To help the people of Haiti, the U.S. State Department took a closer look at the challenges and Secretary Clinton's chief of staff briefed Members on the initial findings. The U.S. hoped to better target and coordinate assistance, engage the Haitian Diaspora, and make assistance sustainable. To help the people of Haiti, the U.S. proposed granting temporary protected status to Haitians living in the U.S. and included Haiti in security initiatives such as the Merida Initiative. With the help of the U.S., U.N. and other countries, the people of Haiti could build a brighter future.

There was also a great conflict between two powerful nations, Morocco and Guinea, over a passport that had been taken away from a citizen of Guinea. The leaders of the United States and France stepped in to help and offered ideas to help the two nations come to a resolution, but the leaders of Morocco and Guinea were still unable to come to an agreement. The leaders of the United States and France then decided to meet in Paris to discuss the situation further and were able to broker a deal between Morocco and Guinea, returning the passport to its rightful owner.

There was also a man named Tom who was nominated for a prestigious position, but his nomination was held up due to a disagreement between two powerful forces. A brave negotiator was able to convince the two sides to lift the hold on Tom's nomination, allowing him to achieve his dream.

21

mamaBiskothu t1_j7kbrwm wrote

Wait why are you continuously calling it chatGPT if it’s not using chatGPT? In an ML sub to boost?

20

TheEdes t1_j7mz548 wrote

This sub is over, it's been taken over by users and startups trying to promote their own products rather than researchers.

9

wittfm t1_j7ilmnm wrote

Are you using chatgpt, or is it just bait in the title? In any case, good work

14

aicharades OP t1_j7invof wrote

This is a series of open source libraries (LangChain) that can extend OpenAI's completions models: https://langchain.readthedocs.io/en/latest/

You can make your own ChatGPT with its own reference library versus the current pre-2022 snapshot.

It’s possible to leapfrog chatgpt with LangChain and OpenAI Completions (excluding some of their labeled training data) until gpt4 comes out

−4

ksblur t1_j7ld3fq wrote

So the answer is in fact, no. ChatGPT is a specific product from Open AI.

23

aicharades OP t1_j7lg2qa wrote

One cool feature of the LangChain framework, https://langchain.readthedocs.io/en/latest/, is that you can easily switch the model you use. So when the ChatGPT API comes out, LangChain allows you to easily move models without upending your pipeline.

This currently uses the latest available API model, text-davinci-003.

Models were a really interesting set of choices for map reduce. happy to share my experiences if anyone is looking for tips

4

Ordinary-Tooth-5140 t1_j7im3cp wrote

Proceed to input 20gb of text

7

aicharades OP t1_j7irepr wrote

Here's a life coach advising Paul Atreides from Dune.

Input: Dune (Novel 1965)

------------------------

Prompt: I want you to act as a life coach for Paul Atreides (also known as Duke, Padishah Emperor, Kwisatz Haderach, Mahdi, Lisan al Gaib). I will provide some details about Paul’s current situation and goals as a member of House Atreides on Dune, and it will be your job to come up with strategies that can help me make better decisions and reach those objectives. This could involve offering advice on various topics, such as creating plans for achieving success or dealing with difficult emotions.

------------------------

Output: Life coach for Paul Atreides:

  • Help him make better decisions & reach goals as a member of House Atreides on Dune. Offer advice on creating plans for success, dealing with emotions, and understanding his potential Mentat capabilities.
  • Help him adjust to new environment & use resources to build confidence & security. Encourage him to embrace his destiny & use his skills to make a difference.
  • Help him navigate a war of assassins, forge legal documents, enlist Fremen allies & more.
  • Help him make decisions that will benefit his people & honor his commitments. Examine his emotional involvement, respect the error factor, & use his Mentat abilities to diagnose the problem.
  • Help him understand Arrakis' attitude of the knife & the consequences of his decisions.
  • Help him recognize the danger of trusting a traitor & the importance of making the right decisions. Guide him to make the best of his situation & use it to his advantage.
  • Help him remember his emergency plan & use his courage & strength to make the right choices.
  • Learn Imperium terminology, Fremen culture, & Great Convention rules to help him navigate his role as Duke of House Atreides.
  • Help him use his Fremen training & experience to navigate the treacherous politics of Dune & protect his people.
  • Guide him to use his leadership & cunning to outwit his enemies & find success.
  • Help him make the most of his resources, recognize & counter his opponents' strategies, & use his own skills & knowledge to succeed.
  • Help him understand the power of subtlety & finesse, & how to use them to achieve his goals.
  • Support him to be brave & compassionate in the face of cruelty & tragedy.
  • Help him use his azelle-like agility & the skills of his retainer Gurney Halleck to make the right choices & protect his people.
  • Help him use his power to gain the throne, while avoiding the mistakes of his father & mother.
  • Guide him to make wise decisions & find the best path forward. Kynes inspected Paul's stillsuit & found him to be a strange combination of softness & armed strength.
  • Help him protect integrity of stillsuit, walk softly, avoid drum sands & tidal dust basins, never travel alone.
  • Encourage him to be decisive and take action, like when he ordered the crew of a factory crawler to evacuate and his own air cover to take them in.
  • Show him how to be mindful of details and think ahead, like noticing the poor neck adjustments on the stillsuits of two of the evacuees. Father & son must face difficult decisions & consequences.
  • Help Paul understand his father's moral fatigue & the power & fear of statecraft.
  • Guide him to use his inherited desert power & the Fremen prophecy to his advantage. Analyze intelligence reports, equipment, & the Fremen to ensure Paul's success.
  • Focus on desert power, such as air power, & develop a plan to recruit five full battalions of Fremen troops before the first CHOAM audit.
  • Be aware of the power of water on Arrakis, and use it to his advantage.
  • Be aware of the potential competition from his own kind, and be prepared to defend his honor.
  • Lastly, be open to the advice of those around him, and use it to make better decisions.
6

__lawless t1_j7ihgh5 wrote

Can you explain what map prompt is? Not sure I understood that part

2

aicharades OP t1_j7ihm0p wrote

Map prompt splits the text into chunks then summarizes those chunks. There's no reduce step.

Here's more detail on how it works: https://langchain.readthedocs.io/en/latest/modules/chains/combine_docs.html

3

__lawless t1_j7ii2h1 wrote

I see it is not a prompt per se, it is an analogue of map operation in ETL.

2

aicharades OP t1_j7ii63v wrote

exactly. the prompt is what the LangChain library uses to manage the text instructions for OpenAI

3

AccidentBackground72 t1_j7kl42h wrote

Any chance you could explain how to use Step 1 a little clearer? I understood the premise, but I'm not quite sure how that would translate to the instruction in step 1. As an example, I'm trying to perform a content analysis of a document with 7 chapters and identify 10-15 core themes in each chapter.

1

aicharades OP t1_j7kpzzn wrote

Of course! Step 1 breaks up your document and runs the prompt on each section. Try it with the Map section vs. Map Reduce (the main page).

Here's an example flow for Map:

  1. Input a Book PDF
  2. Convert the PDF to Text
  3. Split the Book into Chunks: Book[pg1,pg2,pg3] -> pg1, pg2, pg3
  4. Run the Prompt on Each Chunk: pg1, pg2, pg3 -> prompt(pg1), prompt(pg2), etc
  5. Output the Summarized Chunks
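The map flow above can be sketched in a few lines. This is an illustration under assumptions, not the site's code: the text is split by a fixed character count (real tools usually split on tokens or sentence boundaries), and fake_llm is a stand-in you would replace with a real completion call.

```python
from typing import Callable

def split_into_chunks(text: str, chunk_chars: int) -> list[str]:
    """Split text into consecutive pieces of at most chunk_chars characters."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

def run_map(text: str, prompt: str, llm: Callable[[str], str], chunk_chars: int = 8000) -> list[str]:
    """Run the same prompt on every chunk and return one output per chunk."""
    chunks = split_into_chunks(text, chunk_chars)
    return [llm(prompt.replace("{text}", chunk)) for chunk in chunks]

def fake_llm(filled_prompt: str) -> str:
    """Stand-in for a real completion call; replace with your model client."""
    return f"[summary of {len(filled_prompt)} characters]"

book_text = "page one. " * 3  # pretend this came from the PDF-to-text step
outputs = run_map(book_text, "Summarize: {text}", fake_llm, chunk_chars=10)
print(outputs)
```

Because there is no reduce step, the output is simply one result per chunk, in document order.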

Here's a prompt you could use (lots of room for improvement!):

the words in <<*>> are comments, please remove them from the final prompt

Goal: I'm trying to perform a content analysis of a document with 7 chapters and identify 10-15 core themes in each chapter.

Sample Map Prompt:

'INSTRUCTIONS': You are a writer <<BEST ROLE FIT??>> performing a content analysis of a document <<DOCUMENT TYPE??>>. You have been given a section of a larger document. You will identify 10-15 core themes in each chapter and output each theme.

'INPUT': {text}

'OUTPUT':

Sample Reduce Prompt:
'INSTRUCTIONS': You are a copyeditor. You will need to edit a list of summaries together. Please combine the input together and combine any duplicate core themes. Please maintain the context of the document.
'INPUT': {text}
'OUTPUT':

Sample Input: document
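If you keep <<...>> notes in your draft prompts, a small helper can strip them before the prompt is sent. This is just a convenience sketch with a made-up function name, assuming comments never span multiple lines.

```python
import re

def strip_comments(prompt: str) -> str:
    """Remove <<...>> editorial comments and tidy the leftover spacing."""
    without = re.sub(r"\s*<<.*?>>", "", prompt)
    return re.sub(r"[ \t]{2,}", " ", without).strip()

draft = (
    "You are a writer <<BEST ROLE FIT??>> performing a content analysis "
    "of a document <<DOCUMENT TYPE??>>."
)
print(strip_comments(draft))
# You are a writer performing a content analysis of a document.
```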

2

AccidentBackground72 t1_j7kub42 wrote

That's an incredibly helpful overview! For the kind of work I do this is a really awesome tool.

2

aicharades OP t1_j7kvuh1 wrote

It was really awesome to see how OpenAI handles all forms of text when I uploaded the DNC email file. It took the raw emails and created a narrative from them, pretty unreal.

You could do this with a bunch of other historical documents and create stories and chatbots and such.

1

dpineo t1_j7k5e5z wrote

ChatGPT can work on PDF files? How does that work? Does it just parse and interpret the raw PDF file? I'd like to parse some PDF files with many (thousands) of repeated table structures and convert them to JSON, is that something ChatGPT might be able to help with?

2

aicharades OP t1_j7k8car wrote

This app works on all pdf files. It converts PDFs to text with PyMuPDF in Python, as if you were copy-pasting all the text from the PDF (with some of its formatting) into ChatGPT.

1

Old-Radish1611 t1_j7kdwo5 wrote

This is the 2nd tool I've seen for this today, the first being this https://www.chatbase.co/ The creator there hasn't revealed the inner workings yet though like you have, I wonder how they compare?

1

aicharades OP t1_j7ke5cl wrote

Try mine out at www.wrotescan.com! You can use the site for free if you pay for the API call by providing a temporary OpenAI key. I wanted to share the tech with a demo. Remember to delete the key you used after its temp use.

When you sign up for OpenAI, you get $18 of free credits.

You can also create it locally using LangChain

0

aicharades OP t1_j7kf4or wrote

Here's how it summarizes big documents:

Map Reduce

This method involves an initial prompt on each chunk of data (for summarization tasks, this could be a summary of that chunk; for question-answering tasks, it could be an answer based solely on that chunk). Then a different prompt is run to combine all the initial outputs. This is implemented in LangChain as the MapReduceDocumentsChain.

Pros: Can scale to larger documents (and more documents) than StuffDocumentsChain. The calls to the LLM on individual documents are independent and can therefore be parallelized.

Cons: Requires many more calls to the LLM than StuffDocumentsChain. Loses some information during the final combining call.
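Here is a minimal sketch of the map reduce pattern described above, including the parallel map step. It does not use LangChain or reproduce MapReduceDocumentsChain; the function names are illustrative and fake_llm stands in for a real completion call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def map_reduce(
    chunks: list[str],
    map_prompt: str,
    reduce_prompt: str,
    llm: Callable[[str], str],
    max_workers: int = 4,
) -> str:
    """Summarize each chunk independently, then combine the summaries in one call."""
    filled = [map_prompt.replace("{text}", chunk) for chunk in chunks]
    # Map calls do not depend on each other, so they can run concurrently.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(llm, filled))
    combined = "\n\n".join(summaries)
    return llm(reduce_prompt.replace("{text}", combined))

def fake_llm(filled_prompt: str) -> str:
    """Stand-in for a real completion call: upper-cases the text after the first colon."""
    return filled_prompt.split(":", 1)[1].strip().upper()

result = map_reduce(["alpha", "beta"], "Summarize: {text}", "Combine: {text}", fake_llm)
print(result)
```

With a real API, the number of workers would need to stay within your rate limit, and the combined summaries still have to fit in the model's context window for the reduce call.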

Here's how a separate Wikipedia agent chain works:

Agents use an LLM to determine which actions to take and in what order. An action can either be using a tool and observing its output, or returning to the user.

It uses agent based modeling, basically asking itself a series of questions until it gets to the right answer. sorta looks like wikipedia racing.

Here's a sample LangChain agent based on ReAct modeling on Wikipedia/TMDB:

Log

Entering new AgentExecutor chain...

Thought 1: I need to search for a romantic comedy film about a Senatorial candidate falling for a hotel maid, thinking she is a socialite, when he sees her trying on a wealthy woman's dress.

Action 1: Search[romantic comedy film Senatorial candidate hotel maid socialite wealthy woman's dress]

Observation 1: Maid in Manhattan is a 2002 American romantic comedy-drama film directed by Wayne Wang and based on a story by John Hughes, who is credited using a pseudonym. It stars Jennifer Lopez, Ralph Fiennes, and Natasha Richardson. In the film, a hotel maid and a high-profile politician fall in love. The film was released on December 13, 2002, by Columbia Pictures and was a box office success, grossing $154 million against its $55 million budget, while receiving mixed reviews.

Thought 2: Maid in Manhattan is a romantic comedy film about a Senatorial candidate falling for a hotel maid, thinking she is a socialite, when he sees her trying on a wealthy woman's dress.

Action 2: Finish[Maid in Manhattan]

From: https://langchain.readthedocs.io/en/latest/modules/agents/getting_started.html

Here's the sample notebook for the wikipedia agent: https://langchain.readthedocs.io/en/latest/modules/agents/implementations/react.html
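The Thought / Action / Observation loop in the log can be sketched without any library. This is a toy illustration, not LangChain's AgentExecutor: the action format is inferred from the log above, and the model and search tool are scripted stand-ins so the example runs offline.

```python
import re
from typing import Callable

# Matches lines such as "Action 1: Search[some query]" or "Action 2: Finish[answer]".
ACTION_RE = re.compile(r"Action \d+: (\w+)\[(.*)\]")

def run_agent(
    llm: Callable[[str], str],
    tools: dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate model output and tool calls until the model emits Finish[answer]."""
    transcript = f"Question: {question}\n"
    for step in range(1, max_steps + 1):
        reply = llm(transcript)  # expected to contain a Thought and an Action line
        transcript += reply + "\n"
        match = ACTION_RE.search(reply)
        if match is None:
            raise ValueError(f"no action found in step {step}")
        name, argument = match.groups()
        if name == "Finish":
            return argument
        observation = tools[name](argument)  # run the tool the model chose
        transcript += f"Observation {step}: {observation}\n"
    raise RuntimeError("agent did not finish within max_steps")

# Scripted stand-ins so the sketch runs without a model or network access.
scripted_replies = iter([
    "Thought 1: I need to search for the film.\nAction 1: Search[romantic comedy hotel maid]",
    "Thought 2: The film is Maid in Manhattan.\nAction 2: Finish[Maid in Manhattan]",
])

def fake_llm(transcript: str) -> str:
    return next(scripted_replies)

fake_tools = {"Search": lambda query: "Maid in Manhattan is a 2002 American romantic comedy-drama film."}

answer = run_agent(fake_llm, fake_tools, "Which film is about a candidate and a hotel maid?")
print(answer)
```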

1

aicharades OP t1_j7l6f7u wrote

Here's a horror movie based on all of Warren Buffett’s Shareholder Letters from 1977 - 2019

Input: https://medium.com/@r44d/every-berkshire-hathaway-shareholder-letter-in-1-pdf-cdf661625dbb

------------------------

Prompts:

Map Prompt

'Instructions': I want you to act as a screenwriter. You will develop an engaging and creative story outline for a horror feature length film that can captivate its viewers. You have been given an excerpt of a collection of Warren Buffett’s shareholder letters that you must use as source material. The film’s genre must be horror. Start with coming up with interesting characters from the excerpt, the setting of the story, dialogues between the characters etc. Once your character development is complete, create an exciting storyline filled with twists and turns that keeps the viewers in suspense until the end. Note the full names of characters every time they are mentioned. Note location names.

The summary must be no longer than 650 characters long. \n

'Input': {text} \n

'Output':

Reduce Prompt

'Instructions': You are a copyeditor. Combine the below summaries into one cohesive narrative. The combined output must be a story less than 4,000 characters long. Keep the content and context preserved. \n

'Input': {text} \n

'Output':

------------------------

Output: Mrs. Blumkin, a 93 year old Board Chairwoman, is the only one who can save the world from a mysterious force. With the help of her progeny, she must battle the darkness and uncover the secrets of the Courier-Express, Nebraska Furniture Mart, See's Candies, World Book, and Kirby. Along the way, she discovers the power of family and the strength of her own courage.

Charlie and Warren stumble upon a mysterious collection of Warren Buffet's shareholder letters. After reading the letters, they discover the existence of a supernatural being, Mr. Market, who offers them mouth-watering opportunities in exchange for their souls. They must battle against Mr. Market and his minions to save their lives and the world.

The group soon discovers that the company is using dark magic to gain immense wealth and power. With the help of Mrs. B, the 96-year-old founder of the Nebraska Furniture Mart, they must stop the company before it's too late. Along the way, they must battle a ruthless businessman and his henchmen.

When a hurricane causes massive losses, primary insurers and reinsurers are left with financial stakes in each loss settlement. Enter Berkshire Hathaway, who offers to write up to $250 million of catastrophe coverage. When rates become attractive, they book a substantial amount of business. But when losses exceed the retained amount, the reinsurers pay 95% of the excess up to its contractual limit, leaving the primary insurer with a financial stake in each loss. With a huge sum on the line, the stakes are high and the horror begins.

The group soon realizes that the letters contain clues to a mysterious and powerful force that has been manipulating the world's economy for centuries. As they investigate further, they come face to face with a terrifying supernatural entity that will stop at nothing to protect its secrets. With the help of Roberto Goizueta, the former CEO of Coca-Cola, they must find a way to stop the evil before it's too late. Rich Santulli, the CEO of NetJets, is a brilliant executive but also a mysterious figure. When his Aunt Alice asks him if she can afford a fur coat, he replies with a cryptic answer that sets the stage for a horror story.

The group must decipher the clues and battle the dark forces to save the town and themselves. Along the way, they must face off against Tony, a ruthless businessman who is determined to keep the power for himself. With the help of The Washington Post Company, Wells Fargo & Company, and American Express Company, the group must unravel the mystery and save the town.

Tony and Rod, two insurance executives, are tasked with increasing their policy count. When they discover a series of underreported losses, they uncover a sinister plot involving embezzlement and product liability. As they investigate further, they realize the losses are connected to a mysterious figure from the past. With the help of John, Tom, Michael, Don, and Don, they must unravel the secrets of the past before it's too late.

Finally, the group discovers that their investments have unleashed a supernatural force. As they try to stop the force, they must battle the mysterious entity and its minions in a race against time to save the world from destruction. With the help of Jim Kilts, the former CEO of Gillette, they must find a way to stop the evil before it destroys them all. With twists and turns that keep viewers in suspense until the end, this horror film will captivate its audience. Will they be able to save the world before it's too late? With their lives on the line, they must fight to uncover the truth and save the world from the dark force. Will they make it out alive?

1

ShermanTSE t1_j7zdvyg wrote

Absolutely fantastic tool. How do I increase the number of characters in the output instead of fixing it at 4000 characters?

1

aicharades OP t1_j7zogh1 wrote

Try using the map page instead of map reduce.

If you want longer output than 4000 characters, that would be best.

It’s fast so you can run as many summary passes as you want.

Can explain how map works if helpful too.

1

ShermanTSE t1_j844jex wrote

I also got a RateLimitError recently when I tried

1

aicharades OP t1_j84a2g3 wrote

That must be a big document! I may need to adjust the code for more caching
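One possible shape for that caching, as a sketch only (the class name and approach are assumptions, not the current code): wrap the completion function so an identical filled prompt is only sent once. If a run fails partway through with a rate limit error, a rerun then only re-sends the chunks that did not complete.

```python
import hashlib
from typing import Callable

class CachedLLM:
    """Wrap a completion function so identical prompts are only sent once."""

    def __init__(self, llm: Callable[[str], str]):
        self._llm = llm
        self._cache: dict[str, str] = {}
        self.calls = 0  # number of real calls made to the wrapped function

    def __call__(self, filled_prompt: str) -> str:
        key = hashlib.sha256(filled_prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.calls += 1
            # If the wrapped call raises, nothing is stored, so a rerun retries it.
            self._cache[key] = self._llm(filled_prompt)
        return self._cache[key]

def fake_llm(filled_prompt: str) -> str:
    """Stand-in for a real completion call."""
    return filled_prompt.upper()

cached = CachedLLM(fake_llm)
first = [cached(p) for p in ["chunk a", "chunk b"]]
second = [cached(p) for p in ["chunk a", "chunk b", "chunk c"]]  # rerun with one new chunk
print(cached.calls)  # 3
```

This cache lives in memory only. Surviving a restart would need it persisted to disk, and it complements rather than replaces waiting and retrying when the API reports a rate limit.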

1

aicharades OP t1_j881vqg wrote

Here's a theory based on the Stanford Encyclopedia of Philosophy

------------------------

Output:

The theory of mind for the singularity is that it is possible to develop a predictive reasoning framework that is capable of dealing with complex problems such as the Yale Shooting Anomaly. This framework should incorporate a combination of logical and causal theories, such as features and fluents, motivated action theory, state-based minimization in the event calculus, and causal theories. These theories should be able to handle continuous time, concurrent actions, and various kinds of ignorance, and should support retrodiction, prediction, and plan verification. The framework should also be able to deal with the ramification problem, by incorporating static laws that relate the direct consequences of actions to other changes. Additionally, it should be able to reason about the attitudes of other agents, and to understand narratives and diagnose faults in physical devices. To achieve this, a combination of logical AI, non-monotonic logics, and probabilistic reasoning must be used.

The theory of mind for the singularity emphasizes the importance of representation in mental states, formal syntactic description, and content externalism. It is based on enactivism, extended mind, and Leibniz's theory of mind, and incorporates insights from natural language semantics, Bayesian belief networks, and the general theory of event causality. It should also consider the implications of uncertainty, non-monotonic reasoning, and qualitative spatial reasoning. It should be able to handle the complexities of temporal reasoning and the frame problem, and should account for the effects of actions and the persistence of caused propositions.

The identity theory of mind holds that states and processes of the mind are identical to states and processes of the brain, and the Turing Test is a proposal by Alan Turing to answer the question of whether machines can think. The theory of mind for the singularity is that machines can think and possess intelligence, but that they are not conscious in the same way as humans. Machines can process information and make decisions, but they lack the ability to experience qualia, or subjective experiences. The theory of mind for the singularity is that it is a higher order awareness, a perception of one part of (or configuration in) the brain by the brain itself. This awareness is a special sense, different from that of bodily sensation, in which we become aware of parts of our brain.

The theory of mind for the singularity emphasizes the importance of understanding the relationship between humans and machines, and how they can work together in harmony. This framework should include a recognition of the unique capabilities of each, and a respect for the autonomy of both. It should also recognize that machines can be used to augment human capabilities, and that machines can be used to help humans reach their full potential. To pass the Turing Test, the machine must be able to understand and respond to questions in a way that is indistinguishable from a human. Ultimately, the theory of mind for the singularity suggests that intelligence is not limited to any one form or type of computation, but is instead a universal phenomenon that can be found in any system that is capable of learning, adapting, and responding to its environment.

------------------------

Input:

https://plato.stanford.edu/search/search?query=artificial+intelligence

https://plato.stanford.edu/entries/computational-mind/

https://plato.stanford.edu/entries/mind-identity/

https://plato.stanford.edu/entries/modularity-mind/

https://plato.stanford.edu/entries/content-externalism/

https://plato.stanford.edu/entries/leibniz-mind/

https://plato.stanford.edu/entries/ethics-ai/

https://plato.stanford.edu/entries/artificial-intelligence/

https://plato.stanford.edu/entries/logic-ai/

https://plato.stanford.edu/entries/reasoning-defeasible/

https://plato.stanford.edu/entries/turing-test/

https://plato.stanford.edu/entries/cognitive-science/

------------------------

Prompts:

Map

Instructions: I want you to act as a philosopher. I will provide some topics or questions related to the study of philosophy, and it will be your job to explore these concepts in depth. This could involve conducting research into various philosophical theories, proposing new ideas or finding creative solutions for solving complex problems. Ignore all citations. My first request is "I need help developing a theory of mind for the singularity." The output must be no longer than 600 characters long.

Input: {text}

Output:

Reduce

Instructions: You are a copyeditor. Combine the below theories. The combined output must be less than 4,000 characters long. Keep the content and context preserved. \n

Input: {text} \n

Output:

1