Python: Script a Google Autosuggest Extract of Trends for Your Niche Search Keywords

Python Script to Capture Autosuggest Trends

Everyone likes Google Trends, but it gets tricky when it comes to long-tail keywords. We all like the official Google Trends service for the insights it gives into search behavior. However, two things keep many people from using it for serious work:

  1. When you need to find new niche keywords, there is not enough data in Google Trends.
  2. There is no official API for making requests to Google Trends: when we use modules like pytrends, we have to use proxy servers or we get blocked.

In this article, I'll share a Python script we wrote to export trending keywords via Google Autosuggest.

Fetch and Store Autosuggest Results Over Time

Suppose we have 1,000 seed keywords to send to Google Autosuggest. In return, we will probably get around 200,000 long-tail keywords. Then we need to do the same thing a week later and compare the two datasets to answer two questions:

  • Which queries are new keywords compared to the last run? This is probably the case we need. Google thinks these queries are becoming more significant, so by tracking them we can build our own Google Autosuggest trends solution!
  • Which queries are keywords that are no longer trending?
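Both questions come down to set operations on the suggestions from two runs. The following is a minimal sketch of that idea, not the script's own code: the function name `compare_runs` and the sample suggestions are made up for illustration.

```python
def compare_runs(previous, current):
    """Split two runs of suggestions into new, dropped, and persisting sets."""
    previous, current = set(previous), set(current)
    return {
        "new": current - previous,         # appeared since the last run
        "dropped": previous - current,     # no longer suggested
        "persisting": current & previous,  # suggested in both runs
    }


# Hypothetical suggestions for one seed keyword from two weekly runs
last_week = ["running shoes for men", "running shoes sale", "running shoes nike"]
this_week = ["running shoes for men", "running shoes nike", "running shoes carbon plate"]

changes = compare_runs(last_week, this_week)
print(sorted(changes["new"]))      # ['running shoes carbon plate']
print(sorted(changes["dropped"]))  # ['running shoes sale']
```

The full script applies the same comparison per seed keyword and stores dates instead of discarding the older run.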

The script is quite simple, and I shared most of the code here. The updated code saves the data from past runs and compares the suggestions over time. We avoided file-based databases like SQLite to keep things simple, so all data storage uses the CSV files described below. This lets you import the file into Excel and explore the keyword trends for your business.

To Use This Python Script

  1. Enter the set of seed keywords to send to autocomplete in keywords.csv (a one-column file; the first row is read as the column header)
  2. Adjust the script settings to your needs:
    • LANGUAGE: default "en"
    • COUNTRY: default "US"
  3. Schedule the script to run once a week. You can also run it manually whenever you like.
  4. Use keyword_suggestions.csv for further analysis:
    • first_seen: the date the query first appeared in autosuggest
    • last_seen: the date the query was last seen
    • is_new: set to True if first_seen == last_seen. Just filter on this value to get the newly trending searches from Google Autosuggest.
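If you would rather filter in Python than in Excel, here is a minimal sketch using only the standard library. One caveat follows from how is_new is computed: a suggestion that appeared in a single old run and never again also has first_seen == last_seen, so this sketch additionally keeps only rows first seen on the most recent run date. The helper name `new_suggestions` and the sample rows are hypothetical; with a real file you would pass `open("keyword_suggestions.csv", newline="")` instead of the in-memory sample.

```python
import csv
import io

# Hypothetical excerpt of keyword_suggestions.csv with the columns the script writes
SAMPLE = """first_seen,last_seen,Keyword,Suggestion,is_new
2021-05-10,2021-05-10,running shoes,running shoes carbon plate,True
2021-05-03,2021-05-10,running shoes,running shoes for men,False
2021-05-03,2021-05-03,running shoes,running shoes sale,True
"""


def new_suggestions(csv_file):
    """Return suggestions flagged is_new that first appeared in the latest run."""
    rows = list(csv.DictReader(csv_file))
    # ISO dates (YYYY-MM-DD) sort correctly as plain text
    latest_run = max((row["last_seen"] for row in rows), default=None)
    return [
        row["Suggestion"]
        for row in rows
        if row["is_new"] == "True" and row["first_seen"] == latest_run
    ]


print(new_suggestions(io.StringIO(SAMPLE)))  # ['running shoes carbon plate']
```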

Here is the Python code

# Pemavor.com Autocomplete Trends
# Author: Stefan Neefischer (stefan.neefischer@gmail.com)
import concurrent.futures
from datetime import date
from datetime import datetime
import pandas as pd
import itertools
import requests
import string
import json
import time

charList = " " + string.ascii_lowercase + string.digits

def makeGoogleRequest(query):
    # Pause before every request; requests sent too quickly may get blocked by Google
    time.sleep(WAIT_TIME)
    URL="http://suggestqueries.google.com/complete/search"
    PARAMS = {"client":"opera",
            "hl":LANGUAGE,
            "q":query,
            "gl":COUNTRY}
    try:
        response = requests.get(URL, params=PARAMS, timeout=10)
    except requests.RequestException:
        # Network failure or timeout: report an error instead of crashing the worker thread
        return "ERR"
    if response.status_code != 200:
        return "ERR"
    # The response is a JSON array; element 1 holds the list of suggestions
    try:
        return json.loads(response.content.decode('utf-8'))[1]
    except UnicodeDecodeError:
        return json.loads(response.content.decode('latin-1'))[1]

def getGoogleSuggests(keyword):
    # Expand the seed with every character in charList, e.g. "shoes a", "shoes b", ...
    queryList = [keyword + " " + char for char in charList]
    suggestions = []
    for query in queryList:
        suggestion = makeGoogleRequest(query)
        if suggestion != 'ERR':
            suggestions.append(suggestion)

    # Flatten the per-query lists into one set, then drop empty suggestions
    suggestions = set(itertools.chain.from_iterable(suggestions))
    suggestions.discard("")
    return suggestions

def autocomplete(csv_fileName):
    dateTimeObj = datetime.now().date()
    # Read the CSV file containing the seed keywords to send to Google autocomplete
    df = pd.read_csv(csv_fileName)
    keywords = df.iloc[:,0].tolist()
    resultList = []

    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        futuresGoogle = {executor.submit(getGoogleSuggests, keyword): keyword for keyword in keywords}

        for future in concurrent.futures.as_completed(futuresGoogle):
            key = futuresGoogle[future]
            for suggestion in future.result():
                resultList.append([key, suggestion])

    # Convert the results to a dataframe
    suggestion_new = pd.DataFrame(resultList, columns=['Keyword','Suggestion'])
    del resultList

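    # If we have results from an earlier run, read them
    try:
        suggestion_df=pd.read_csv("keyword_suggestions.csv")
    except (FileNotFoundError, pd.errors.EmptyDataError):
        # First run: start with an empty history
        suggestion_df=pd.DataFrame(columns=['first_seen','last_seen','Keyword','Suggestion'])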
    
    suggestionCommon_list=[]
    suggestionNew_list=[]
    for keyword in suggestion_new["Keyword"].unique():
        new_df=suggestion_new[suggestion_new["Keyword"]==keyword]
        old_df=suggestion_df[suggestion_df["Keyword"]==keyword]
        newSuggestion=set(new_df["Suggestion"].to_list())
        oldSuggestion=set(old_df["Suggestion"].to_list())
        commonSuggestion=list(newSuggestion & oldSuggestion)
        new_Suggestion=list(newSuggestion - oldSuggestion)
         
        for suggest in commonSuggestion:
            suggestionCommon_list.append([dateTimeObj,keyword,suggest])
        for suggest in new_Suggestion:
            suggestionNew_list.append([dateTimeObj,dateTimeObj,keyword,suggest])
    
    # Suggestions that appear for the first time in this run
    newSuggestion_df = pd.DataFrame(suggestionNew_list, columns=['first_seen','last_seen','Keyword','Suggestion'])
    # Suggestions seen before and again today: refresh their last_seen date
    commonSuggestion_df = pd.DataFrame(suggestionCommon_list, columns=['last_seen','Keyword','Suggestion'])
    # Join on Keyword and Suggestion so a suggestion shared by two seeds is not cross-matched
    merge=pd.merge(suggestion_df, commonSuggestion_df, on=["Keyword","Suggestion"], how='left')
    merge = merge.rename(columns={'last_seen_y': 'last_seen'})
    # Rows not seen today keep their previous last_seen date
    merge["last_seen"] = merge["last_seen"].fillna(merge["last_seen_x"])
    del merge["last_seen_x"]
    
    # Merge old results with new results
    frames = [merge, newSuggestion_df]
    keywords_df =  pd.concat(frames, ignore_index=True, sort=False)
    # Normalize both date columns, then sort newest first
    keywords_df['first_seen'] = pd.to_datetime(keywords_df['first_seen'])
    keywords_df['last_seen'] = pd.to_datetime(keywords_df['last_seen'])
    keywords_df = keywords_df.sort_values(by=['first_seen','Keyword'], ascending=[False,False])
    # A suggestion counts as new while it has only been seen in a single run
    keywords_df['is_new'] = (keywords_df['first_seen'] == keywords_df['last_seen'])
    keywords_df=keywords_df[['first_seen','last_seen','Keyword','Suggestion','is_new']]
    # Save dataframe as a CSV file
    keywords_df.to_csv('keyword_suggestions.csv', index=False)

# With more than 50 seed keywords you should slow down your requests, otherwise Google blocks the script
# With thousands of seed keywords use e.g. WAIT_TIME = 1 and MAX_WORKERS = 5
WAIT_TIME = 0.2
MAX_WORKERS = 20
# Set the autocomplete language
LANGUAGE = "en"
# Set the autocomplete country code - DE, US, TR, GR, etc.
COUNTRY="US"
# Seed keyword CSV file name. One-column CSV file.
CSV_FILE_NAME="keywords.csv"
autocomplete(CSV_FILE_NAME)
# The results are saved in keyword_suggestions.csv
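
To pick WAIT_TIME and MAX_WORKERS, it helps to estimate how many requests a run sends. The script appends a space, a letter, or a digit to each seed, so every seed becomes 37 queries. The sketch below reproduces that expansion and derives a rough lower bound on the runtime. The helper names `expand_seed` and `estimate_run` are hypothetical, and the estimate ignores network latency and assumes the seeds spread evenly across workers.

```python
import string

# Same suffix alphabet as the script: a space, a-z, and 0-9 (37 characters)
CHAR_LIST = " " + string.ascii_lowercase + string.digits


def expand_seed(keyword):
    """Build the autocomplete queries the script sends for one seed keyword."""
    return [keyword + " " + char for char in CHAR_LIST]


def estimate_run(seed_count, wait_time, max_workers):
    """Return (total requests, lower bound on runtime in seconds)."""
    total_requests = seed_count * len(CHAR_LIST)
    # Each worker sleeps wait_time before every request; network time is ignored
    min_seconds = total_requests * wait_time / max_workers
    return total_requests, min_seconds


requests_needed, seconds = estimate_run(seed_count=1000, wait_time=1, max_workers=5)
print(requests_needed, round(seconds / 3600, 1))
```

With 1,000 seeds and the slower settings suggested in the comments (WAIT_TIME = 1, MAX_WORKERS = 5), that comes to 37,000 requests and roughly two hours of waiting, so a weekly schedule leaves plenty of room.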

Download the Python Script
