/start #19

tolcipularang02 · 2025-01-05T17:00:03Z

#NIK

tolcipularang02 · 2025-01-05T17:01:13Z

#NIK

tolcipularang02 · 2025-01-05T17:01:34Z

#NIK

An0n-xen · 2025-01-07T22:25:15Z

Hello, I'm trying to run the code, but I keep running into this error:
TypeError: Cannot read properties of undefined (reading 'stories')
at C:\Users\Hawis\Documents\Projects\Personal\trendFinder\src\services\scrapeSources.ts:128:33
at Generator.next ()
at fulfilled (C:\Users\Hawis\Documents\Projects\Personal\trendFinder\src\services\scrapeSources.ts:5:58)
at processTicksAndRejections (node:internal/process/task_queues:95:5)
[nodemon] clean exit - waiting for changes before restart

An0n-xen · 2025-01-07T22:25:41Z

Is this due to the Twitter API key? I'm using a free X API key.

An0n-xen · 2025-01-07T22:35:36Z

Never mind I've fixed it

hsmnzaydn · 2025-01-07T23:18:59Z

> Never mind I've fixed it

How did you resolve it?

An0n-xen · 2025-01-07T23:25:05Z

OK, todayStories in scrapeSources.ts was coming back empty (undefined), which gave me that error. Initially I thought I had set the wrong Firecrawl API key, but it was the right API key.

So I just added an if block to check whether I receive any data.

I will send a code sample

An0n-xen · 2025-01-07T23:25:51Z

if (todayStories && todayStories.stories) {
  console.log(
    `Found ${todayStories.stories.length} stories from ${source}`
  );
  combinedText.stories.push(...todayStories.stories);
} else {
  console.log(`No valid stories data found from ${source}`);
}
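The same guard can be pulled out into a small helper, shown here as a minimal sketch rather than the repository's actual code. The `Story` and `ExtractResult` types and the `storiesOrEmpty` name are hypothetical, introduced only for illustration; the field names are taken from the snippet above.

```typescript
// Hypothetical types for illustration; the real project may define these differently.
interface Story {
  headline: string;
  link: string;
  date_posted: string;
}

interface ExtractResult {
  stories?: Story[];
}

// Returns the stories array when the response is usable, otherwise an empty array.
// This avoids "Cannot read properties of undefined (reading 'stories')".
function storiesOrEmpty(
  result: ExtractResult | null | undefined,
  source: string
): Story[] {
  if (!result || !Array.isArray(result.stories)) {
    console.log(`No valid stories data found from ${source}`);
    return [];
  }
  console.log(`Found ${result.stories.length} stories from ${source}`);
  return result.stories;
}
```

With a helper like this, the call site would reduce to something like `combinedText.stories.push(...storiesOrEmpty(todayStories, source));`. Note that this only hides the crash; it does not explain why the response is empty in the first place.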

hsmnzaydn · 2025-01-07T23:26:53Z

@An0n-xen thanks so much you saved my day <3

An0n-xen · 2025-01-07T23:28:06Z

But my Firecrawl shows that it made those requests, yet I still receive an empty response.

hsmnzaydn · 2025-01-07T23:32:07Z

> But my Firecrawl shows that it made those requests, yet I still receive an empty response.

I have the same problem :( If I fix it, I'll share my solution.

An0n-xen · 2025-01-07T23:32:51Z

After adding the if check to fix the empty-response error, this is what I get:

An0n-xen · 2025-01-07T23:33:08Z

> > But my Firecrawl shows that it made those requests, yet I still receive an empty response.
>
> I have the same problem :( If I fix it, I'll share my solution.

Sure, I would really appreciate that.

i-am-henri · 2025-01-08T19:50:18Z

I'm facing the same problem, @An0n-xen and @hsmnzaydn. There is a problem with the extract method, but you can use an alternative: scrapeUrl with the "extract" format. Here is the new solution:

import FirecrawlApp from '@mendable/firecrawl-js';
import dotenv from 'dotenv';
// Removed Together import
import { z } from 'zod';
// Removed zodToJsonSchema import since we no longer enforce JSON output via Together

dotenv.config();

// Initialize Firecrawl
const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });

// 1. Define the schema for our expected JSON
const StorySchema = z.object({
  headline: z.string().describe("Story or post headline"),
  link: z.string().describe("A link to the post or story"),
  date_posted: z.string().describe("The date the story or post was published"),
});

const StoriesSchema = z.object({
  stories: z.array(StorySchema).describe(
    "A list of today's AI or LLM-related stories"
  ),
});

export async function scrapeSources(sources: string[]) {
  const num_sources = sources.length;
  console.log(`Scraping ${num_sources} sources...`);

  let combinedText: { stories: any[] } = { stories: [] };

  // Configure these if you want to toggle behavior
  const useTwitter = false;
  const useScrape = true;

  for (const source of sources) {
    // --- 1) Handle x.com (Twitter) sources ---
    if (source.includes("x.com")) {
      if (useTwitter) {
        const usernameMatch = source.match(/x\.com\/([^\/]+)/);
        if (usernameMatch) {
          const username = usernameMatch[1];

          // Build the search query for tweets
          const query = `from:${username} has:media -is:retweet -is:reply`;
          const encodedQuery = encodeURIComponent(query);

          // Get tweets from the last 24 hours
          const startTime = new Date(
            Date.now() - 24 * 60 * 60 * 1000
          ).toISOString();
          const encodedStartTime = encodeURIComponent(startTime);

          // x.com API URL
          const apiUrl = `https://api.x.com/2/tweets/search/recent?query=${encodedQuery}&max_results=10&start_time=${encodedStartTime}`;

          // Fetch recent tweets from the Twitter API
          const response = await fetch(apiUrl, {
            headers: {
              Authorization: `Bearer ${process.env.X_API_BEARER_TOKEN}`,
            },
          });

          if (!response.ok) {
            throw new Error(`Failed to fetch tweets for ${username}: ${response.statusText}`);
          }

          const tweets = await response.json();

          if (tweets.meta?.result_count === 0) {
            console.log(`No tweets found for username ${username}.`);
          } else if (Array.isArray(tweets.data)) {
            console.log(`Tweets found from username ${username}`);
            const stories = tweets.data.map((tweet: any) => {
              return {
                headline: tweet.text,
                link: `https://x.com/i/status/${tweet.id}`,
                date_posted: startTime,
              };
            });
            combinedText.stories.push(...stories);
          } else {
            console.error(
              "Expected tweets.data to be an array:",
              tweets.data
            );
          }
        }
      }
    }
    // --- 2) Handle all other sources with Firecrawl extract ---
    else {
      if (useScrape) {
        // Firecrawl will both scrape and extract for you
        // Provide a prompt that instructs Firecrawl what to extract
        const currentDate = new Date().toLocaleDateString();
        const promptForFirecrawl = `
        Return only today's AI or LLM related story or post headlines and links in JSON format from the page content.
        They must be posted today, ${currentDate}. The format should be:
        {
          "stories": [
            {
              "headline": "headline1",
              "link": "link1",
              "date_posted": "YYYY-MM-DD"
            },
            ...
          ]
        }
        If there are no AI or LLM stories from today, return {"stories": []}.
        
        The source link is ${source}. 
        If a story link is not absolute, prepend ${source} to make it absolute. 
        Return only pure JSON in the specified format (no extra text, no markdown, no \`\`\`). 
        `;
        console.log("get the post")
        // !! new method
        const scrapeResult = await app.scrapeUrl(source, {
          formats: ["extract"],
          extract: {
            prompt: promptForFirecrawl,
            schema: StoriesSchema
          }
        });

        if (!scrapeResult.success || !scrapeResult.extract?.stories) {
          throw new Error(`Failed to scrape: ${scrapeResult.error}`);
        }

        // The structured data
        const todayStories = scrapeResult.extract;
        console.log(todayStories)
        if (todayStories && todayStories.stories) {
          console.log(
            `Found ${todayStories.stories.length} stories from ${source}`
          );
          combinedText.stories.push(...todayStories.stories);
        } else {
          console.log(`No valid stories data found from ${source}`);
        }
      }
    }
  }

  // Return the combined stories from all sources
  const rawStories = combinedText.stories;
  console.log(rawStories);
  return rawStories;
}
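The prompt above asks the model to make relative story links absolute, which it may not always do. As a hedged sketch, the links could also be normalized in code after extraction using the standard `URL` constructor. The `toAbsoluteLink` and `normalizeLinks` names are hypothetical and not part of the posted solution. Note that this resolves a link relative to the source URL, which is not quite the same as literally prepending the source string.

```typescript
// Hypothetical helper: resolve a possibly relative link against the source URL.
// Falls back to the original string if either value cannot be parsed as a URL.
function toAbsoluteLink(link: string, source: string): string {
  try {
    return new URL(link, source).href;
  } catch {
    return link;
  }
}

// Hypothetical helper: apply the normalization to every story from one source.
function normalizeLinks<T extends { link: string }>(
  stories: T[],
  source: string
): T[] {
  return stories.map((story) => ({
    ...story,
    link: toAbsoluteLink(story.link, source),
  }));
}
```

In `scrapeSources`, this could be applied just before the push, for example `combinedText.stories.push(...normalizeLinks(todayStories.stories, source));`.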
